CN115115389A - Express customer loss prediction method based on value subdivision and integrated prediction - Google Patents

Info

Publication number
CN115115389A
Authority
CN
China
Prior art keywords
customer
value
prediction
loss
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210236263.7A
Other languages
Chinese (zh)
Other versions
CN115115389B (en)
Inventor
孙哲
曹艺译
孙知信
赵学健
汪胡青
宫婧
胡冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202210236263.7A
Publication of CN115115389A
Application granted
Publication of CN115115389B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an express customer loss prediction method based on value subdivision and integrated prediction, comprising a customer value subdivision module, a loss prediction and early warning module, and a personalized saving module. The customer value subdivision module measures customer value and classifies customers. The loss prediction and early warning module comprises a network point customer loss prediction module and a single customer loss rate prediction module, and predicts whether customers will be lost and at what rate. The personalized saving module provides personalized saving schemes for customers of different value according to the influence index system of customer loss and the value importance of each customer. The invention can classify customers accurately, predict with high precision whether a customer will be lost, the loss probability, and the customer loss amount at each network point, and issue personalized loss early warnings based on the prediction results.

Description

Express customer loss prediction method based on value subdivision and integrated prediction
Technical Field
The invention relates to an express customer loss prediction method based on value subdivision and integrated prediction, and belongs to the technical field of logistics and machine learning.
Background
Because the express service industry in China started relatively late, its service concepts and marketing management models have not adapted well to market demands. When formulating customer-facing operating strategies, an enterprise would ideally apply different strategies to different customers and thereby achieve precision operation. Precision operation presupposes customer relationship management, and the core of customer relationship management is customer classification. Customer classification subdivides the customer base, distinguishes low-value from high-value customers, allows different personalized services to be offered to different groups, and allocates limited resources sensibly across customers of different value so as to maximize benefit.
This patent applies the following algorithms:
The RFM model is an important tool for measuring customer value and customer profitability. It describes a customer's value through three indexes: the recency of the customer's latest consumption R, the consumption frequency F, and the consumption amount M.
A meta-heuristic algorithm is inspired by biological behaviors or physical phenomena; its core idea is to balance random exploration against local search during the search process. On multi-modal, discrete, and non-differentiable real-world optimization problems, meta-heuristic algorithms show strong practicality and optimization capability, and they have been applied successfully in many scientific fields.
The chimpanzee optimization algorithm (ChOA) is an optimization algorithm inspired by the hunting behavior of chimpanzees in nature, in which individuals take different actions to search for prey according to their division of labor. It simulates the individual intelligence, sexual motivation, and predation behavior of chimpanzees, and builds an effective optimization scheme from processes such as driving, chasing, and attacking. The standard ChOA divides the chimpanzee population into four types: attackers, barriers, drivers, and chasers. The attackers lead the population, the other three types assist the hunt, and social status declines in that order.
The sine cosine algorithm is a relatively new nature-inspired optimization algorithm that solves an optimization problem by creating multiple random candidate solutions and updating them with a mathematical model based on sine and cosine functions. The sine mechanism supports global search for the optimal solution, reduces the optimization blind spots of the cosine mechanism, and makes individuals less likely to fall into local optima; local exploitation in turn compensates for the slow convergence of the sine-driven global search, improving search capability and accelerating convergence. Used together, sine and cosine balance the exploration and exploitation capabilities of the algorithm and jointly improve its performance.
Gaussian variation (Gaussian mutation) is a mutation operation that improves the local search performance of a genetic algorithm in key search regions. During mutation, the original gene value is replaced by a random number drawn from a normal distribution with mean μ and variance σ². From the properties of the normal distribution, Gaussian variation concentrates its search in a local region near the original individual. It creates a new offspring by adding a random value drawn from a Gaussian distribution to each element of the individual's vector.
Ensemble learning is a machine learning approach in which a series of learners are trained and their results are combined by a certain rule, so as to obtain better performance than a single learner. Ensemble learning can be applied to classification, regression, feature selection, outlier detection, and more, and it appears in nearly every area of machine learning.
Disclosure of Invention
The invention aims to provide an express customer loss prediction method based on value subdivision and integrated prediction that addresses the factors influencing customer loss, the individual needs of different customers, and the precision operation of enterprises.
The technical scheme of the invention is as follows: an express delivery customer loss prediction method based on value subdivision and integrated prediction comprises a customer value subdivision module, a loss prediction and early warning module and a personalized saving module,
the customer value subdivision module is used for customer value measurement and customer classification: an LSRMT customer value subdivision model is designed by improving the RFM model, the relevant indexes are introduced and their values are given an initial grade, the index weights are then determined by a dual-objective constrained model of the index weights, and finally each customer's final value score is calculated by summing the per-index value scores, thereby classifying the customers;
the loss prediction and early warning module comprises a network point customer loss prediction module and a single customer loss rate prediction module, wherein the network point customer loss prediction module comprises constructing an influence index system of customer loss, improving the chimpanzee optimization algorithm, and predicting loss with a customer loss prediction model that fuses the improved chimpanzee optimization algorithm with XGBoost; the single customer loss rate prediction module mainly comprises constructing a customer information system and predicting the single customer loss rate with an ensemble learning model: new feature attributes are generated from the customers' original behavior data to construct the customer information system, multiple ensemble learning models are used as base prediction classifiers, subsets of the features are selected as attribute feature subsets to train the base prediction classifiers, the weights of the sub-models are then trained by a linear classifier, and finally whether a customer will be lost and the loss rate are predicted from the weighted result;
the early warning and personalized saving module provides personalized saving schemes for customers of different value according to the influence index system of customer loss and the value importance of the customers; it establishes an objective-constraint model relating the queuing time influence index to customer loss, giving enterprises more realistic and credible data support while minimizing customer loss.
Further, the express delivery customer churn prediction method based on value subdivision and integrated prediction includes: the customer value segmentation module comprises the following steps:
step 1: defining the following customer value indexes: customer relationship duration L, customer sending activity S, customer receiving activity R, average customer spending M, and customer trust T, and assigning each index an initial grade $x_j$ according to the sorted data set:
Figure RE-GDA0003808959460000031
wherein $j = 1, 2, 3, 4, 5$; $a$ denotes the lower threshold and $b$ the upper threshold, with specific values chosen reasonably by binning the actual data set;
step 2: computing the value information $VI_{ij}$ of each value index from the selected customer sample data, substituting the data into a dual-objective constrained optimization model built on the uncertainty of the value information $VI_{ij}$ and the consistency between the index weight $W_j$ and the sample weight component $w_{ij}$, and solving for the preferred weights:
Figure RE-GDA0003808959460000032
value information uncertainty objective function:
Figure RE-GDA0003808959460000033
weight consistency objective function:
Figure RE-GDA0003808959460000034
constraint (sum of index weights is 1):
$$\sum_{j=1}^{5} W_j = 1$$
wherein $m$ is the number of selected customer samples, $W_j$ is the weight of the $j$-th index, and $w_{ij}$ is the weight component of the $i$-th sample for the $j$-th index;
step 3: calculating the value index $V_j$ of each index for the customer from the initial grade and the assigned index weight, which expresses the customer's score at each index layer:
$$V_j = x_j \times W_j$$
summing the value indexes $V_j$ over all indexes of the customer gives the total value score $V_{sum}$:
$$V_{sum} = \sum_{j=1}^{5} V_j$$
Step 4: ranking the customers by total value score $V_{sum}$ and dividing them into value classes: core value customers (user1), general value customers (user2), and potential value customers (user3).
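The grading, weighting, summing and ranking steps above can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the three-level grading rule around the thresholds a and b, the example weights, and the share-based class cut-offs are all assumptions, because the source gives the grading formula and the weight optimization only as figures.

```python
# Hypothetical index weights W_j for (L, S, R, M, T); they must sum to 1.
WEIGHTS = [0.15, 0.30, 0.15, 0.25, 0.15]


def grade(value, a, b):
    """Assumed 3-level grade x_j: 1 below a, 2 in [a, b), 3 at or above b."""
    if value < a:
        return 1
    if value < b:
        return 2
    return 3


def total_value_score(grades, weights=WEIGHTS):
    """V_sum = sum over j of V_j, where V_j = x_j * W_j."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("index weights must sum to 1")
    return sum(x * w for x, w in zip(grades, weights))


def classify(scores, core_share=0.2, general_share=0.4):
    """Rank customers by V_sum and label them user1/user2/user3.

    The share-based cut-offs are an assumption made for illustration.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_core = max(1, round(len(ranked) * core_share))
    n_general = round(len(ranked) * general_share)
    labels = {}
    for position, customer in enumerate(ranked):
        if position < n_core:
            labels[customer] = "user1"
        elif position < n_core + n_general:
            labels[customer] = "user2"
        else:
            labels[customer] = "user3"
    return labels
```

In practice the weights would come from the dual-objective model of step 2 rather than being fixed by hand.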
Further, the express delivery customer churn prediction method based on value subdivision and integrated prediction includes: the network point customer loss prediction module comprises the following steps:
(1) constructing an index system influencing customer loss;
(2) constructing an improved chimpanzee optimization algorithm, and training a model to obtain the relevant parameters;
(3) constructing and optimizing a BSGChOA_XGBoost model for training;
(4) predicting customer loss under different indexes.
Further, the express delivery customer churn prediction method based on value subdivision and integrated prediction includes: in step (1), the index system influencing customer loss is a 3-layer index system set comprising a target layer, a criterion layer and an index layer.
Further, the express delivery customer churn prediction method based on value subdivision and integrated prediction includes: the step (2) comprises the following steps,
step 1: initializing the parameter settings of the chimpanzee optimization algorithm;
step 2: generating an initial population by applying an improved Bernoulli chaotic mapping to the initial positions, with random variable factors introduced to make the initial population more uniformly distributed: a chaotic sequence in the interval [0, 1] is generated through the chaotic mapping relation and then transformed into the individuals' search space to generate the initial population;
Figure RE-GDA0003808959460000042
wherein i represents the current population scale, k represents the variable serial number of the chaotic mapping,
Figure RE-GDA0003808959460000043
expressing the k mapping function value;
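As a rough illustration of chaotic initialization, the sketch below uses the standard Bernoulli shift map, not the improved mapping with random variable factors described above, whose formula survives only as a figure. The parameter `lam`, the seeding strategy, and the function names are assumptions.

```python
import random


def bernoulli_sequence(z0, length, lam=0.4):
    """Iterate the standard Bernoulli shift map, which stays inside [0, 1]."""
    sequence = []
    z = z0
    for _ in range(length):
        if z <= 1.0 - lam:
            z = z / (1.0 - lam)
        else:
            z = (z - (1.0 - lam)) / lam
        sequence.append(z)
    return sequence


def init_population(pop_size, lower, upper, lam=0.4, seed=0):
    """Map one chaotic sequence per individual into the search space."""
    rng = random.Random(seed)
    dim = len(lower)
    population = []
    for _ in range(pop_size):
        chaos = bernoulli_sequence(rng.uniform(0.01, 0.99), dim, lam)
        individual = [lo + z * (hi - lo)
                      for z, lo, hi in zip(chaos, lower, upper)]
        population.append(individual)
    return population
```

Each coordinate is obtained as `lower + z * (upper - lower)`, which is the usual way to carry a [0, 1] chaotic value into a bounded search space.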
step 3: calculating the fitness of each individual in the chimpanzee population, selecting the 5 individuals with the best fitness and recording their positions as $X_{attacker}$, $X_{observer}$, $X_{chaser}$, $X_{barrier}$ and $X_{driver}$ respectively;
Step 4: updating the convergence factor $f(t)$ and the coefficient vectors $a$ and $c$,
Figure RE-GDA0003808959460000051
$a = 2 \times f(t) \times R_3 - f(t)$ (9)
$c = 2 \times R_5$ (10)
wherein $R_1$, $R_3$, $R_5$ are random factors in $[0, 1]$, $T_{max}$ is the maximum number of iterations, and $k$ is an adjustment factor with $k \in [1, 5]$;
Step 5: investigator
Figure RE-GDA0003808959460000053
Position updating: the other individuals decide whether to hunt according to the current position information of the investigator. If the current prey arrest success rate $P_{arrested}$ is greater than the minimum arrest rate $P_{min}$, the attacker, barrier, driver and chaser immediately take hunting action while the investigator continues to search for the next prey; the investigator's position is updated according to the mapped initial population information and the positional relationship between the investigator and the other chimpanzee individuals,
Figure RE-GDA0003808959460000052
wherein $P_{arrested}$ is the current arrest rate, $P_{min}$ is the minimum arrest rate, $R_2$ is a random number in $[0, 1]$, and SND is a random number drawn from the standard normal distribution;
step 6: after the investigator's position is updated, the remaining individuals enter the search iteration stage. The random factor $R_4$ is updated first to decide whether to enter global or local search: if $R_4 \geq \varepsilon$, the global search stage is entered. In the global search stage, an adaptive factor $w$ that changes with the iteration number is introduced, and the positions of the chimpanzee individuals in the population are updated according to the factor's change curve,
$w = \alpha(\cosh(\pi t / T_{max}) + \delta)$ (12)
$X_1 = w_1 \times (X_{attacker} - a_1 |c_1 X_{attacker} - m_1 X|)$ (13)
$X_2 = w_2 \times (X_{chaser} - a_2 |c_2 X_{chaser} - m_2 X|)$ (14)
$X_3 = w_3 \times (X_{barrier} - a_3 |c_3 X_{barrier} - m_3 X|)$ (15)
$X_4 = w_4 \times (X_{driver} - a_4 |c_4 X_{driver} - m_4 X|)$ (16)
$X_5 = w_5 \times (X_{observer} - a_5 |c_5 X_{observer} - m_5 X|)$ (17)
Figure RE-GDA0003808959460000061
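The adaptive factor of equation (12) and one leader-guided candidate from equations (13) to (17) can be sketched as follows. The default values of `alpha` and `delta` are placeholders, and the vector `m` is simply passed in, since the source does not define it (in the standard ChOA it is a chaotic value). How the five candidates are combined in equation (18) is not recoverable from the text, so that step is left out.

```python
import math


def adaptive_factor(t, t_max, alpha=0.5, delta=0.0):
    """Equation (12): w = alpha * (cosh(pi * t / T_max) + delta)."""
    return alpha * (math.cosh(math.pi * t / t_max) + delta)


def candidate_position(x, leader, a, c, m, w):
    """One of equations (13)-(17), applied per dimension.

    x      : current position of the individual
    leader : position of the attacker, chaser, barrier, driver or observer
    a, c, m: coefficient vectors for this leader
    w      : adaptive factor for this leader
    """
    return [w * (l_d - a_d * abs(c_d * l_d - m_d * x_d))
            for x_d, l_d, a_d, c_d, m_d in zip(x, leader, a, c, m)]
```

Calling `candidate_position` once per leader yields the five candidates $X_1$ to $X_5$.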
step 7: if $R_4 < \varepsilon$, entering the local search stage. To prevent the algorithm from falling into a local optimum in the early or late iterations, the local search stage checks whether the current iteration number $t$ is less than a specified iteration number $t^*$. If so, the conversion parameter $\beta$ is updated and the attacker's position is updated using the improved sine cosine algorithm; otherwise, Gaussian variation is applied to the attacker's position,
after introducing sine and cosine algorithm, the position updating formula of the attacker in the population is as follows:
Figure RE-GDA0003808959460000062
wherein $X_{attacker}(t)$ denotes the position of the attacker in the $t$-th iteration; $\rho_1$, $\rho_2$, $\rho_3$ are random numbers with $\rho_1 \in [0, 2\pi]$, $\rho_2 \in [0, 2]$, $\rho_3 \in [0, 1]$; $P(t)$ denotes the position of the current optimal individual; and $\beta$ is the conversion parameter, calculated as:
Figure RE-GDA0003808959460000063
wherein $\beta_{max}$ and $\beta_{min}$ are the maximum and minimum values of the conversion parameter, and $T_{max}$ is the maximum number of iterations,
the mathematical model of gaussian variation for the attacker position is as follows:
$X'_{attacker} = X \times [1 + k \times N(0, 1)]$ (21)
wherein $X'_{attacker}$ is the updated position vector, $X$ is the position vector of the current chimpanzee individual, $k$ is a coefficient in $[0, 1]$, and $N(0, 1)$ is a Gaussian random vector with mean 0 and variance 1;
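Equation (21) translates almost directly into code. The sketch below draws one standard normal value per dimension; the function name and the per-dimension sampling are assumptions.

```python
import random


def gaussian_variation(x, k, rng=random):
    """Equation (21): X' = X * (1 + k * N(0, 1)), applied per dimension."""
    return [x_d * (1.0 + k * rng.gauss(0.0, 1.0)) for x_d in x]
```

Because the perturbation is multiplicative, a coordinate that is exactly 0 stays at 0, and `k = 0` leaves the position unchanged.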
step 8: calculating the new fitness from the new solutions obtained and updating the optimal individual and its position information; checking whether the algorithm meets the iteration termination condition: if the maximum number of iterations $T_{max}$ has been reached, terminating and outputting the optimal position; otherwise returning to Step 4 and executing again.
Further, the express delivery customer churn prediction method based on value subdivision and integrated prediction includes: when the XGBoost model is constructed in step (3), the time-series survey data of customer loss under each influence index are used as the main feature input, the corresponding loss-influencing indexes are used as labels, and the parameter values found by the chimpanzee optimization algorithm are used to minimize the objective function and establish the optimal tree-structure model, specifically comprising the following steps:
step 1: in the BSGChOA_XGBoost model training stage, taking the prediction accuracy of the model as the fitness function of each chimpanzee individual; first initializing the population randomly and setting the initial value and value range of each model parameter;
step 2: selecting part of the sample data from the criterion layer of the influence index set as a training set for model training and parameter optimization, and calculating the fitness function value of the model, which represents the best solution obtained in each run of the chimpanzee optimization algorithm; the remaining samples serve as a test set for the final evaluation of model performance; predictions are made on samples drawn from the training set and then averaged to obtain the final prediction result;
step 3: verifying the trained model on the test set and judging from the fitness function value whether each parameter value is the best found so far; if so, replacing the original parameters; otherwise keeping the current parameters;
step 4: checking whether the algorithm meets the iteration termination condition; if the maximum number of iterations has been reached, terminating and outputting the optimal value of each parameter found during the iterations; otherwise returning to Step 2 and executing again.
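The control flow of these four steps can be sketched as a generic population-based search. This is only a skeleton under stated assumptions: `fitness` is a hypothetical callback that would train an XGBoost model with the given parameters and return its accuracy, and the position update used here is a simple pull toward the best candidate with noise, which stands in for, and is not, the BSGChOA update.

```python
import random


def optimize_hyperparameters(fitness, bounds, pop_size=10, max_iter=20, seed=0):
    """Search `bounds` ({name: (low, high)}) for parameters maximizing `fitness`."""
    rng = random.Random(seed)
    names = list(bounds)

    def clip(name, value):
        low, high = bounds[name]
        return min(max(value, low), high)

    # Step 1: random initial population inside the value ranges.
    population = [{n: rng.uniform(*bounds[n]) for n in names}
                  for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(max_iter):
        # Steps 2 and 3: score each candidate, keep it if it beats the best so far.
        for params in population:
            fit = fitness(params)
            if fit > best_fit:
                best, best_fit = dict(params), fit
        # Placeholder update (NOT the BSGChOA rule): move toward the best with noise.
        population = [
            {n: clip(n, p[n] + rng.random() * (best[n] - p[n])
                     + rng.gauss(0.0, 0.05) * (bounds[n][1] - bounds[n][0]))
             for n in names}
            for p in population
        ]
    # Step 4: the loop ends at the maximum number of iterations.
    return best, best_fit
```

With XGBoost, `bounds` might cover parameters such as `learning_rate` and `max_depth`, with integer-valued parameters rounded inside the callback before training.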
Further, the express customer churn prediction method based on value subdivision and integrated prediction comprises the following steps: the single customer attrition rate prediction module comprises the steps of:
step 1: generating various data characteristics of the client and constructing a client information system;
step 2: according to the information of the customer information system, a two-layer integrated learning algorithm is used for predicting the attrition rate of a single customer,
a first layer:
1) selecting n base classifiers and marking;
2) selecting the feature attribute sets $F_1 = \{b_1, b_2, \ldots, b_5\}$ and $F_3 = \{b_8, b_9, \ldots, b_{17}\}$ to generate data sets D1 and D2 respectively; the first $m$ base classifiers are trained on the training sample set with D1 and D2 as input sets, yielding $m$ prediction results $P_k$; selecting $F_4 = \{b_{18}, b_{19}, \ldots, b_{23}\}$ as the input data of the remaining $n-m$ classifiers yields $n-m$ prediction results $P_k$;
3) integrating the training results of the $n$ individual models and constructing the $n$-dimensional feature vector $P = (P_1, P_2, P_3, \ldots, P_n)^T$;
A second layer:
1) the output feature vector $P$ is used as the input of a linear classifier, the weight $weight_j$ of each model in the integrated model is learned by gradient descent, and the final prediction result $P^*$ based on the $n$ base prediction models, that is, the final customer loss probability, is obtained by weighting,
$$P^* = \sum_{j=1}^{n} weight_j \times Churn_j(u, i)$$
wherein $Churn_j(u, i)$ denotes the loss prediction probability produced by the $j$-th model, and $weight_j$ denotes the weight assigned to the $j$-th model.
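The second layer can be sketched as a small logistic regression trained by batch gradient descent on the base-model outputs. Choosing logistic regression as the linear classifier, adding a bias term, and the learning rate and epoch count are all assumptions; the source only states that a linear classifier learns the model weights by gradient descent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weights(P, y, lr=0.5, epochs=2000):
    """Learn one weight per base model from rows of base-model probabilities."""
    n_models = len(P[0])
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, target in zip(P, y):
            error = sigmoid(sum(w * p for w, p in zip(weights, row)) + bias) - target
            for j in range(n_models):
                grad_w[j] += error * row[j]
            grad_b += error
        # Average the gradients over the batch before stepping.
        weights = [w - lr * g / len(P) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(P)
    return weights, bias


def predict_churn(row, weights, bias):
    """Weighted combination of base-model probabilities, squashed to (0, 1)."""
    return sigmoid(sum(w * p for w, p in zip(weights, row)) + bias)
```

Each row of `P` holds the churn probabilities that the n base models produced for one customer, and `y` holds the observed labels (1 for lost, 0 for retained).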
Further, the express delivery customer churn prediction method based on value subdivision and integrated prediction includes: the data features in Step 1 consist of the customer basic features $F_1$, the order features $F_2$, the customer-order interaction features $F_3$ and the order activity features $F_4$.
Further, the express delivery customer churn prediction method based on value subdivision and integrated prediction includes: the customer saving scheme design module comprises the following steps:
step 1: with reference to the customer loss rate and customer loss amount predicted by the prediction models, following up first with the customers who have the highest loss probability and offering them promotions for returning customers;
step 2: for the queuing time, an influence index of network point customer loss, constructing an objective-constraint model relating customer loss to waiting time, service intensity and queue length $L_s$, with the aim that the customer's waiting time does not exceed the longest service time the customer can tolerate and the service intensity of the service staff does not exceed the maximum service intensity, so as to give enterprises more realistic and credible data support while minimizing customer loss,
the objective function is:
Figure RE-GDA0003808959460000082
the constraint conditions are as follows:
Figure RE-GDA0003808959460000083
where $\rho$ is the average service intensity,
$\rho_{max}$ is the maximum service intensity the service staff can bear,
$W_s$ is the customer's average waiting time,
$W_{s-max}$ is the maximum residence time the customer can tolerate,
$L_s$ is the average queue length.
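To make the constraint side concrete, the sketch below checks the two limits for a single-server M/M/1 queue. The M/M/1 assumption and its textbook formulas are not stated in the source, whose objective function and constraints survive only as figures; `arrival_rate` and `service_rate` are hypothetical inputs in customers per unit time.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Return (rho, L_s, W_s) for an M/M/1 queue (assumed queue model)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival_rate must be below service_rate")
    rho = arrival_rate / service_rate            # service intensity
    l_s = rho / (1.0 - rho)                      # mean number of customers in the system
    w_s = 1.0 / (service_rate - arrival_rate)    # mean time a customer spends in the system
    return rho, l_s, w_s


def satisfies_constraints(arrival_rate, service_rate, rho_max, w_s_max):
    """Check rho <= rho_max and W_s <= W_s-max."""
    rho, _, w_s = mm1_metrics(arrival_rate, service_rate)
    return rho <= rho_max and w_s <= w_s_max
```

A planner could scan candidate service rates and keep those for which `satisfies_constraints` holds before evaluating the loss objective.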
By combining the LSRMT model, the improved chimpanzee optimization algorithm and the ensemble learning algorithm, the invention can classify customers accurately, predict with high precision whether a customer will be lost, the loss probability, and the customer loss amount at each network point, and provide personalized loss early warnings based on the prediction results.
Drawings
FIG. 1 is a block diagram of the express customer loss prediction method based on value subdivision and integrated prediction;
FIG. 2 is a flow diagram of a customer value segmentation module;
FIG. 3 is a flow chart of a modified chimpanzee optimization algorithm (BSGChOA) based on a hybrid strategy;
FIG. 4 is a flow diagram of a BSGChOA_XGBoost model prediction module;
FIG. 5 is a flow diagram of an integrated predictive model based on a customer information system;
FIG. 6 is a set of index systems affecting customer churn in the express industry;
FIG. 7 is a set of customer attribute features constructed based on feature engineering.
Detailed Description
In order to make the implementation purpose, technical scheme and advantages of the invention clearer, the technical scheme of the invention is clearly and completely described in the following steps with the combination of the attached drawings.
As shown in the attached drawing 1, the express customer loss prediction method based on value subdivision and integrated prediction provided by the invention comprises a customer value subdivision module and a loss prediction and early warning and personalized saving module, wherein the customer subdivision is realized by extracting customer value index data and measuring and calculating value scores, and then loss prediction and early warning are respectively carried out on each class of customers. The churn prediction module comprises churn rate prediction of a single client and churn amount prediction of network point clients under different influence indexes. And then according to the prediction result, the current value situation and the future loss situation of the customers with different values are considered from different influence index dimensions, and an early warning scheme of customer loss is provided for the enterprise. The customer saving scheme design module provides personalized saving schemes for customers with different values according to key influence indexes influencing the customer loss and the value importance degree of the customers. And a target constraint model influencing index queuing time and customer loss is designed in a key mode, the optimal queuing time is found, the enterprise can be guaranteed to obtain a certain service volume, the minimum loss cost of the customer loss is met, and more real and credible data support is provided for the enterprise.
As shown in fig. 2, the customer value segmentation module is mainly used for customer value estimation and customer classification. In order to better understand the customer value, the LSRMT customer value subdivision model is designed by improving the RFM model. And the client relation duration, the sending activity, the receiving activity, the average consumption amount and the client trust index are introduced, the index values are subjected to initial grade division, then the index weights are determined according to a dual-target constraint model of the index weights, and finally the final value scores of the clients are calculated through summing the index value indexes, so that the classification of the clients is realized.
As shown in fig. 3 and 4, the website customer loss prediction module includes the construction of an influence index system of customer loss, the improvement of a chimpanzee optimization algorithm, and the loss prediction using an improved chimpanzee optimization algorithm (bsgchoaa) and an XGBoost fused customer loss prediction model. Referring to the current state of development of the express industry, a 3-layer index system set of a target layer, a criterion layer and an index layer is constructed as shown in fig. 6. In the model training process, a statistical data set of the customer loss is introduced, parameters of a training model are optimized and selected by using an improved BSGChOA algorithm, and a BSGChOA _ XGboost model between the statistical data set of the customer loss and key factors influencing customer loss is established.
As shown in fig. 5, the single customer attrition rate prediction module mainly includes a customer information system construction and an integrated learning model to predict the single customer attrition rate, and as shown in fig. 7, the module generates a new feature attribute based on the original behavior data of the customer, and constructs a customer information system from a plurality of dimensions such as a customer basic feature, an order feature, a customer order interaction feature, an order activity degree, and the like. And adopting various integrated learning models as a base prediction classifier, selecting the customer basic characteristics and customer order interaction characteristics as an attribute characteristic subset to train the base prediction classifier, then training the weight of a submodel through a linear classifier, and finally making prediction on whether customers lose and the loss rate according to a weighting result.
The early warning and personalized saving module provides personalized saving schemes for customers of different value according to the index system of factors influencing customer loss and the value importance of each customer. It also designs a target constraint model relating the queuing-time influence index to customer loss at a given network point, so that enterprises receive more realistic and reliable data support while customer loss is minimized.
The customer value segmentation module comprises the following steps:
Step 1: Select the customer data corresponding to the customer value indexes (LSRMT) from the customer data set of the enterprise's operating history. The customer relationship duration L is the time interval, in days, from the customer's first order to the current order; the customer sending activity S is the number of items the customer successfully sent in the last month; the customer receiving activity R is the number of items the customer successfully received in the last month; the average consumption amount M is the customer's total spending in the last month divided by the total number of shipments; the customer trust T is the total number of times the customer cancelled a shipment or an addressee midway, and expresses how much the customer relies on and trusts the enterprise. Each index is then assigned an initial grade x_j according to the sorted data set:
Figure RE-GDA0003808959460000101
where j = 1, 2, 3, 4, 5; a is the lower threshold and b is the upper threshold, with the specific values chosen reasonably according to the actual data set.
Step 2: Calculate the value information VI_ij of each value index from the selected customer sample data. Because the index weights are assigned according to the current value information VI_ij of each index, the model must describe both the uncertainty of VI_ij in the sample data and the consistency between the currently selected sample weight component w_ij and the index weight W_j in order to calculate the customer index weights more accurately. The data are therefore substituted into a dual-objective constrained optimization model that minimizes the uncertainty of the value information VI_ij and the inconsistency between the index weight W_j and the sample weight component w_ij, and a better set of weights is solved for.
Figure RE-GDA0003808959460000111
Value information uncertainty objective function:
Figure RE-GDA0003808959460000112
weight consistency objective function:
Figure RE-GDA0003808959460000113
constraint (sum of index weights is 1):
Figure RE-GDA0003808959460000114
where m is the number of samples of the selected customer index, W_j is the weight of the j-th index, and w_ij is the weight component of the j-th index of the i-th sample.
Step 3: Calculate the value score V_j of each of the customer's indexes from the initial grade and the weight assigned to the index. V_j expresses the customer's score on each index layer and is calculated as:
V_j = x_j × W_j
The value scores V_j of the customer's indexes are summed to obtain the total value score V_sum of the current customer; the larger the total score, the greater the customer's value is considered to be:
Figure RE-GDA0003808959460000121
Step 4: According to the total value score V_sum, the customers are ranked and graded by value, and classified into core-value customers (user1), general-value customers (user2) and potential-value customers (user3).
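The scoring in Steps 1 to 4 can be sketched as follows. This is a minimal illustration, not the patented procedure: the three-level grading rule stands in for equation (1), which is only available as an image, and the thresholds, weights and class cut-offs are hypothetical values chosen for the example.

```python
import numpy as np

def grade(value, a, b, higher_is_better=True):
    """Assumed grading rule: 1 below a, 2 in [a, b), 3 at or above b."""
    level = 1 if value < a else (2 if value < b else 3)
    # For an index such as T (cancellation count), fewer is better, so invert.
    return level if higher_is_better else 4 - level

def total_value_score(x, W):
    """V_j = x_j * W_j for each index, then V_sum = sum over j of V_j."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    assert np.isclose(W.sum(), 1.0), "index weights must sum to 1"
    return float(np.sum(x * W))

def classify(v_sum, core_cut, general_cut):
    """Map a total score to user1 / user2 / user3 (cut-offs are hypothetical)."""
    if v_sum >= core_cut:
        return "user1"
    if v_sum >= general_cut:
        return "user2"
    return "user3"

# Hypothetical (a, b, higher_is_better) per index in the order L, S, R, M, T.
rules = [(90, 365, True), (2, 10, True), (2, 10, True), (10.0, 30.0, True), (1, 3, False)]
weights = [0.2, 0.25, 0.2, 0.25, 0.1]      # hypothetical W_j, summing to 1
raw = [400, 12, 5, 25.0, 0]                # one customer's L, S, R, M, T values

x = [grade(v, a, b, hib) for v, (a, b, hib) in zip(raw, rules)]
v_sum = total_value_score(x, weights)
label = classify(v_sum, core_cut=2.5, general_cut=1.8)
```

In practice the weights W_j would come from the dual-objective model of Step 2 rather than being fixed by hand.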
After the customers have been classified by value, customer loss must be predicted. To improve prediction accuracy and keep the algorithm from falling into a local optimum during the search stage, the scheme optimizes the parameters of the training model with an improved chimpanzee optimization algorithm based on a hybrid strategy (BSGChOA). The network point customer loss prediction module comprises the following steps:
Step 1: Initialize the parameters of the improved chimpanzee optimization algorithm: the chimpanzee population size N = 30, the maximum number of iterations T_max = 500, the maximum value of the conversion parameter β_max = 10 and its minimum value β_min = 1, the minimum arrest rate P_min = 0.7, the specified number of iterations t* = 200, and the switching coefficient ε = 0.4.
Step 2: Generate the initial population. An improved Bernoulli chaotic map is applied to the positions of the initial population, and a random variable factor is introduced to make the initial population more uniformly distributed. A chaotic sequence in the interval [0, 1] is generated by the chaotic mapping relation and then transformed into the individuals' search space to produce the initial population.
Figure RE-GDA0003808959460000122
Wherein i represents the current population scale, k represents the variable serial number of the chaotic mapping,
Figure RE-GDA0003808959460000123
representing the function value of the k-th mapping.
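The idea of chaotic initialization can be illustrated with a short sketch. Because equation (7) is only available as an image, the code below uses the standard Bernoulli shift map as a stand-in; it does not reproduce the patent's improved map or its random variable factor. The parameter lam, the seed and the clipping bounds are assumptions.

```python
import numpy as np

def bernoulli_map(z, lam=0.4):
    """One step of the standard Bernoulli shift map on (0, 1)."""
    if z <= 1.0 - lam:
        return z / (1.0 - lam)
    return (z - (1.0 - lam)) / lam

def chaotic_population(n, dim, lower, upper, lam=0.4, seed=0):
    """Iterate the map n times and scale each chaotic vector into [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    z = rng.uniform(0.01, 0.99, size=dim)       # random starting point per dimension
    pop = np.empty((n, dim))
    for i in range(n):
        z = np.array([bernoulli_map(v, lam) for v in z])
        z = np.clip(z, 1e-6, 1.0 - 1e-6)        # keep away from the fixed point at 0
        pop[i] = lower + z * (upper - lower)    # map from [0, 1] to the search space
    return pop

# N = 30 individuals, as in Step 1; the 5-dimensional unit box is hypothetical.
population = chaotic_population(30, 5, [0.0] * 5, [1.0] * 5)
```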
Step 3: Calculate the fitness of each individual in the chimpanzee population, select the 5 individual positions with the best fitness, and record them as X_attacker, X_observer, X_chaser, X_barrier and X_driver.
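Selecting the five leading individuals might look like the sketch below. It assumes that a larger fitness is better (the fitness is later defined as prediction accuracy) and that the roles are assigned in the order listed in Step 3; the function name select_leaders is hypothetical.

```python
import numpy as np

ROLES = ["attacker", "observer", "chaser", "barrier", "driver"]

def select_leaders(positions, fitness, maximize=True):
    """Return the positions of the five fittest individuals, keyed by role."""
    order = np.argsort(fitness)        # ascending fitness
    if maximize:
        order = order[::-1]            # best (largest) first
    return {role: positions[idx].copy() for role, idx in zip(ROLES, order[:5])}

positions = np.arange(12, dtype=float).reshape(6, 2)
fitness = np.array([0.1, 0.9, 0.5, 0.7, 0.3, 0.2])
leaders = select_leaders(positions, fitness)
```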
Step 4: To balance the exploration and exploitation capabilities of the algorithm, a nonlinear convergence factor f(t) that changes dynamically with the number of iterations is introduced and substituted into the formulas below to update the coefficient vectors a and c in the chimpanzee search stage.
Figure RE-GDA0003808959460000131
a = 2 × f(t) × R_3 − f(t) (9)
c = 2 × R_5 (10)
where R_1, R_3 and R_5 are random factors in [0, 1], T_max is the maximum number of iterations, and k is an adjustment factor, k ∈ [1, 5].
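Equations (9) and (10) translate directly into code. The convergence factor f(t) of equation (8) is only available as an image, so the decay curve below is an assumed placeholder, not the patent's formula; only the two lines computing a and c follow the text.

```python
import numpy as np

def convergence_factor(t, t_max, k=2.0):
    """Assumed stand-in for f(t): decays nonlinearly from 2 to 0 as t approaches t_max."""
    return 2.0 * (1.0 - (t / t_max) ** k)

def coefficient_vectors(t, t_max, dim, rng, k=2.0):
    """Draw R_3 and R_5 in [0, 1) and build the coefficient vectors a and c."""
    f = convergence_factor(t, t_max, k)
    r3 = rng.random(dim)
    r5 = rng.random(dim)
    a = 2.0 * f * r3 - f      # equation (9)
    c = 2.0 * r5              # equation (10)
    return a, c

rng = np.random.default_rng(0)
a, c = coefficient_vectors(t=100, t_max=500, dim=4, rng=rng)
```

Since R_3 lies in [0, 1], each component of a lies in [−f(t), f(t)], so the step size shrinks as f(t) decays.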
Step 5: investigator
Figure RE-GDA0003808959460000133
And (4) updating the position. Other individuals need to determine whether to take hunting actions according to the current position information of the investigator. If the current prey arrest success rate P arrested Greater than the minimum arrest rate P min Then, the attacker, the obstacle, the driver and the chaser immediately take hunting action, the inspector continuously searches the next hunter and updates the position information according to the mapped initialization population information and the position relationship between the inspector and other chimpanzee individuals.
Figure RE-GDA0003808959460000132
where P_arrested is the current arrest rate, P_min is the minimum arrest rate, R_2 is a random number in [0, 1], and SND is a random number that follows the standard normal distribution.
Step 6: After the observer's position is updated, the remaining individuals enter the search iteration stage. The random factor R_4 is updated first to decide whether to enter global search or local search: if R_4 ≥ ε, the global search stage is entered. In the global search stage, an adaptive factor w that changes with the number of iterations is introduced, and the positions of the chimpanzees in the population are updated according to the curve of this factor.
w = α(cosh(πt/T_max) + δ) (12)
X_1 = w_1 × {X_attacker − a_1 |C_1 X_attacker − m_1 X|} (13)
X_2 = w_2 × {X_chaser − a_2 |C_2 X_chaser − m_2 X|} (14)
X_3 = w_3 × {X_barrier − a_3 |C_3 X_barrier − m_3 X|} (15)
X_4 = w_4 × {X_driver − a_4 |C_4 X_driver − m_4 X|} (16)
X_5 = w_5 × {X_observer − a_5 |C_5 X_observer − m_5 X|} (17)
Figure RE-GDA0003808959460000141
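A sketch of the global search update in equations (12) to (17) follows. Equation (18) is only available as an image; combining the five candidates by their plain mean is an assumption borrowed from the usual chimp optimization update, and the values of α and δ are hypothetical.

```python
import numpy as np

def adaptive_factor(t, t_max, alpha=0.5, delta=0.0):
    """Equation (12): w = alpha * (cosh(pi * t / t_max) + delta)."""
    return alpha * (np.cosh(np.pi * t / t_max) + delta)

def leader_candidate(x, leader, a, c, m, w):
    """Equations (13)-(17): w * (leader - a * |c * leader - m * x|)."""
    return w * (leader - a * np.abs(c * leader - m * x))

def global_update(x, leaders, coeffs):
    """Combine the five candidates; the plain mean is an assumed form of (18)."""
    candidates = [leader_candidate(x, leader, a, c, m, w)
                  for leader, (a, c, m, w) in zip(leaders, coeffs)]
    return np.mean(candidates, axis=0)

x = np.ones(3)                                  # current individual
leaders = [np.ones(3)] * 5                      # attacker, chaser, barrier, driver, observer
coeffs = [(0.0, 1.0, 1.0, 1.0)] * 5             # (a_i, C_i, m_i, w_i), hypothetical values
x_new = global_update(x, leaders, coeffs)
```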
Step 7: If R_4 < ε, the local search stage is entered. To prevent the algorithm from falling into a local optimum in the early or late iterations, the local search stage checks whether the current iteration number t is less than the specified iteration number t*. If so, the conversion parameter β is updated and the attacker's position is updated with an improved sine-cosine algorithm; otherwise, Gaussian mutation is applied to the attacker's position. The sine-cosine mechanism compensates for the slow convergence of global search during the local search stage, improves the search capability and accelerates convergence. Gaussian mutation lets the curve fluctuate more strongly in the later iterations so that the search escapes local optima quickly. At this stage, the remaining chimpanzees still change position according to the original position update formula.
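The branching in Steps 6 and 7 reduces to two comparisons. The sketch below only selects which update rule applies; the function name and the returned labels are hypothetical, and the default values of ε and t* are those given in Step 1.

```python
def choose_update(r4, t, epsilon=0.4, t_star=200):
    """Pick the update rule for the current iteration.

    r4 is the random factor R_4 in [0, 1], t the current iteration number.
    """
    if r4 >= epsilon:
        return "global"               # Step 6: global search
    if t < t_star:
        return "sine_cosine"          # Step 7, early iterations
    return "gaussian_mutation"        # Step 7, later iterations

modes = [choose_update(0.8, 10), choose_update(0.1, 10), choose_update(0.1, 300)]
```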
After introducing sine and cosine algorithm, the position updating formula of the attacker in the population is as follows:
Figure RE-GDA0003808959460000142
where X_attacker(t) denotes the position of the attacker in the t-th iteration; ρ_1, ρ_2 and ρ_3 are random numbers with ρ_1 ∈ [0, 2π], ρ_2 ∈ [0, 2] and ρ_3 ∈ [0, 1]; p(t) denotes the position of the currently optimal individual; and β is the conversion parameter, calculated as:
Figure RE-GDA0003808959460000143
where β_max and β_min are the maximum and minimum values of the conversion parameter, respectively, and T_max is the maximum number of iterations.
The mathematical model of gaussian variation for the attacker position is as follows:
X'_attacker = X × [1 + k × N(0, 1)] (21)
where X'_attacker is the updated position vector, X is the position vector of the current chimpanzee, k is a decreasing variable in [0, 1], and N(0, 1) is a Gaussian-distributed random vector with mean 0 and variance 1.
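Equation (21) can be written in a few lines. The text only says that k decreases within [0, 1]; the linear schedule used below is an assumption made for the example.

```python
import numpy as np

def gaussian_mutation(x, t, t_max, rng):
    """Equation (21): X' = X * (1 + k * N(0, 1)), applied element-wise."""
    k = 1.0 - t / t_max                       # assumed linear decay of k from 1 to 0
    noise = rng.standard_normal(x.shape)      # N(0, 1) random vector
    return x * (1.0 + k * noise)

rng = np.random.default_rng(0)
x = np.array([0.2, 0.5, 0.8])
x_mid = gaussian_mutation(x, t=250, t_max=500, rng=rng)
x_end = gaussian_mutation(x, t=500, t_max=500, rng=rng)   # k = 0, so no perturbation
```

The perturbation is multiplicative, so components close to zero move very little; a mutated position may also leave the search bounds and would then need clipping.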
Step 8: Calculate the new fitness, the optimal individual and its position from the new solution, and check whether the algorithm meets the iteration termination condition. If the maximum number of iterations T_max has been reached, terminate and output the optimal position; otherwise, return to Step 4 and execute it again.
Step 9: Construct and train the BSGChOA_XGBoost model. When the XGBoost model is constructed, the time-series survey data of customer loss under each influence index are used as the main feature input, the corresponding influence indexes are used as labels, and the parameter values are optimized by BSGChOA to minimize the objective function, thereby establishing the optimal tree-structure model. This comprises the following steps:
Step 9.1: In the training stage of the BSGChOA_XGBoost model, the prediction accuracy of the model is used as the fitness function of each chimpanzee. First, the population is randomly initialized, and the initial value and value range of each model parameter are set: the learning rate learning_rate has a default of 0.30 and a range of 0.05 to 0.30; the minimum loss reduction required to split a node, gamma, has a default of 0 and a range of 0 to 0.20; the maximum tree depth max_depth has a default of 6 and a range of 4 to 10; the minimum sum of sample weights in a leaf node, min_child_weight, has a default of 1 and a range of 1 to 10; and the weight of the L2 regularization term, lambda, has a default of 1 and a range of 0.1 to 10.
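One way to connect the optimizer to these ranges is to let each chimpanzee position live in the unit cube and decode it into a parameter dictionary, as sketched below. The decoding scheme and the names SEARCH_SPACE and decode are assumptions. The key reg_lambda is the scikit-learn-style name of the L2 term, which the native XGBoost parameter set calls lambda.

```python
# Value ranges from Step 9.1, keyed by XGBoost parameter name.
SEARCH_SPACE = {
    "learning_rate": (0.05, 0.30),
    "gamma": (0.0, 0.20),
    "max_depth": (4, 10),
    "min_child_weight": (1, 10),
    "reg_lambda": (0.1, 10.0),
}
INTEGER_PARAMS = {"max_depth"}

def decode(position):
    """Map a position in [0, 1]^5 to a parameter dictionary within the ranges."""
    params = {}
    for u, (name, (low, high)) in zip(position, SEARCH_SPACE.items()):
        u = min(max(float(u), 0.0), 1.0)          # clamp to the unit interval
        value = low + u * (high - low)
        params[name] = int(round(value)) if name in INTEGER_PARAMS else value
    return params

lowest = decode([0.0] * 5)
middle = decode([0.5] * 5)
```

A fitness function would then train a classifier with the decoded dictionary on the training split and return its prediction accuracy.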
Step 9.2: Select 80% of the sample data in criterion layer B of the influence index set for model training and parameter optimization, and calculate the fitness function value of the model; this value represents the best solution obtained in each run of the chimpanzee optimization algorithm. The remaining 20% serves as the test set for the final evaluation of model performance. Sampled predictions are made on the training set, and the prediction results are averaged to obtain the final prediction result.
Step 9.3: Validate the model on the test set and evaluate, according to the fitness function value, whether each parameter value has reached the current optimum. If so, replace the original parameters; otherwise, keep the current parameters.
Step 9.4: Check whether the algorithm meets the iteration termination condition. If the maximum number of iterations has been reached, terminate and output the optimal values of all parameters found during the iterations; otherwise, return to Step 9.2 and execute it again.
The prediction module for a single customer attrition rate comprises the following steps:
Step 1: Generate the customer's data features from the customer basic features F_1, the order features F_2, the customer-order interaction features F_3 and the order liveness features F_4 among the behavior features, and construct the customer information system.
Step 2: and predicting the loss rate condition of a single client by using a two-layer integrated learning algorithm according to the information of the client information system.
A first layer:
1) Select n base classifiers and label them. Here n is generally in the range [3, 5]: if n is too small, prediction accuracy is insufficient; if n is too large, algorithm complexity and cost increase.
2) Select the feature attribute sets F_1 = {b_1, b_2, ..., b_5} and F_3 = {b_8, b_9, ..., b_17} to generate data sets D1 and D2. The first m base classifiers are trained on the training sample set with D1 and D2 as input sets, yielding m prediction results P_k. Because F_4 reflects the likelihood of customer churn most indirectly, F_4 = {b_18, b_19, ..., b_23} is selected as the input data of the remaining n−m classifiers, yielding n−m prediction results P_k.
3) Integrate the training results of the n individual models by constructing the n-dimensional feature vector P = (P_1, P_2, P_3, ..., P_n)^T.
A second layer:
1) The output feature vector P is used as the input of a linear classifier, the weight weight_j of each model in the integrated model is learned by gradient descent, and the final prediction result based on the n base prediction models is P*. The final probability of customer churn is then predicted by weighting.
Figure RE-GDA0003808959460000161
where Churn_j(u, i) is the loss prediction probability generated by the j-th model, and weight_j is the weight assigned to the j-th model.
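The second layer can be sketched as a logistic regression trained by gradient descent on the base-model outputs. Equation (22) is only available as an image, so passing the weighted sum through a sigmoid is an assumption about how the weighting produces a probability; the learning rate, epoch count and synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_stacking_weights(P, y, lr=0.5, epochs=2000):
    """Learn one weight per base model (plus a bias) by batch gradient descent.

    P has shape (n_samples, n_models): column j holds Churn_j for each sample.
    y holds the true churn labels (0 or 1).
    """
    n_samples, n_models = P.shape
    w = np.full(n_models, 1.0 / n_models)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(P @ w + b)
        w -= lr * (P.T @ (p - y)) / n_samples    # gradient of the log-loss w.r.t. w
        b -= lr * float(np.mean(p - y))          # gradient w.r.t. the bias
    return w, b

def churn_probability(p_row, w, b):
    """Final churn probability for one customer from the base-model outputs."""
    return float(sigmoid(p_row @ w + b))

# Synthetic example: model 0 is informative, model 1 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200).astype(float)
P = np.column_stack([0.1 + 0.8 * y, rng.random(200)])
w, b = fit_stacking_weights(P, y)
```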
The customer saving scheme design module comprises the following steps:
Step 1: The preceding steps have measured and classified customer value and have predicted, from historical customer consumption data, the customer churn rate and the customer loss amount at each network point. To give enterprises early warning and saving schemes for the loss of customers of different value, the predicted churn rate and loss amount are consulted, the customers with the highest churn probability are revisited first, and preferential offers for returning customers are provided.
Step 2: For the queuing length L_s, an influence index of customer loss at network points, the goal is to ensure that the customers' waiting time does not exceed the longest service time they can tolerate and that the service intensity of the service staff does not exceed the maximum service intensity. A target constraint model of customer loss, waiting time, service intensity and queue length is constructed, which provides enterprises with more realistic and reliable data support while minimizing customer loss.
an objective function:
Figure RE-GDA0003808959460000162
constraint conditions are as follows:
Figure RE-GDA0003808959460000171
where ρ is the average service intensity,
ρ_max is the maximum service intensity that the service staff can withstand,
W_s is the average waiting time of the customer,
W_s-max is the maximum residence time that the customer can tolerate, and
L_s is the average queue length.
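The two constraints can be checked numerically once a queueing model is chosen. The objective and constraints of equations (23) and (24) are only available as images, so the sketch below assumes a standard M/M/c queue and searches for the smallest number of service staff that satisfies ρ ≤ ρ_max and W_s ≤ W_s-max. The arrival rate lam, the service rate mu and the function names are hypothetical.

```python
from math import factorial

def mmc_metrics(lam, mu, c):
    """Steady-state M/M/c metrics (rho, W_s, L_s); requires lam < c * mu."""
    a = lam / mu                      # offered load
    rho = a / c                       # service intensity per server
    if rho >= 1.0:
        raise ValueError("unstable queue: rho must be below 1")
    p0 = 1.0 / (sum(a ** k / factorial(k) for k in range(c))
                + a ** c / (factorial(c) * (1.0 - rho)))
    lq = p0 * a ** c * rho / (factorial(c) * (1.0 - rho) ** 2)   # mean number waiting
    ws = lq / lam + 1.0 / mu          # mean time in system
    ls = lam * ws                     # mean number in system (Little's law)
    return rho, ws, ls

def min_servers(lam, mu, rho_max, ws_max, c_limit=50):
    """Smallest c meeting both constraints, or None if none is found up to c_limit."""
    for c in range(1, c_limit + 1):
        if lam >= c * mu:
            continue                  # queue would be unstable with this few servers
        rho, ws, ls = mmc_metrics(lam, mu, c)
        if rho <= rho_max and ws <= ws_max:
            return c, rho, ws, ls
    return None

single = mmc_metrics(lam=2.0, mu=4.0, c=1)
plan = min_servers(lam=2.0, mu=4.0, rho_max=0.4, ws_max=0.5)
```

With one server the sketch reduces to the M/M/1 results W_s = 1/(μ − λ) and L_s = ρ/(1 − ρ), which gives a quick sanity check.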
The invention combines the LSRMT model, the improved chimpanzee optimization algorithm and the ensemble learning algorithm. It can classify customers accurately, predict with high precision whether a customer will churn, the churn probability and the customer loss amount at network points, and provide personalized loss early warning according to the prediction results.
Of course, various modifications and alterations of this invention may be made by those skilled in the art without departing from the spirit and scope of this invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (9)

1. An express customer loss prediction method based on value subdivision and integrated prediction, characterized by comprising a customer value subdivision module, a loss prediction and early warning module, and a personalized saving module, wherein
the client value subdivision module is used for client value measurement and calculation and client classification, an LSRMT client value subdivision model is designed by adopting an improved RFM model, relevant indexes are introduced, initial grade division is carried out on index values, then the index weights are determined according to a dual-target constraint model of the index weights, and finally the final value scores of clients are calculated by summing the index value indexes to realize the classification of the clients;
the loss prediction and early warning module comprises a network point customer loss prediction module and a single customer loss rate prediction module, wherein the network point customer loss prediction module comprises the construction of an influence index system of customer loss, the improvement of a chimpanzee optimization algorithm and the prediction of loss by using an improved chimpanzee optimization algorithm and an XGboost fused customer loss prediction model; the single client attrition rate prediction module mainly comprises client information system construction and an integrated learning model prediction single client attrition rate, new characteristic attributes are generated based on original behavior data of clients, a client information system is constructed, multiple integrated learning models are used as a base prediction classifier, partial characteristics are selected as attribute feature subsets to train the base prediction classifier, then weights of sub-models are trained through a linear classifier, and finally whether the clients are attrited or not and the attrition rate prediction is made according to weighting results;
the early warning and personalized saving module is used for providing personalized saving schemes for different value customers according to an influence index system of customer loss and the value importance degree of the customers, and the model establishes a target constraint model of the influence index of queuing time and the customer loss, so that more real and credible data support is provided for an enterprise on the premise of realizing minimum loss of the customers.
2. The express customer churn prediction method based on value breakdown and integrated prediction according to claim 1, wherein: the customer value segmentation module comprises the following steps:
step 1: defining the following customer value indexes, the customer relation duration L, the customer sending activity S, the customer receiving activity R, the average customer spending amount M and the customer trust level T, and performing initial grade division on the indexes according to the sorted data set
Figure DEST_PATH_IMAGE002
Figure DEST_PATH_IMAGE004
(1)
wherein j = 1, 2, 3, 4, 5; a represents a lower threshold, b represents an upper threshold, and the specific values are selected reasonably according to the actual data set;
step 2: measuring and calculating the value information VI_ij of each value index according to the selected customer sample data, and substituting the data into a dual-objective constrained optimization model that minimizes the uncertainty of the value information VI_ij and the inconsistency between the index weight W_j and the sample weight component w_ij, so as to solve for a better weight:
Figure DEST_PATH_IMAGE012
(2)
value information uncertainty objective function:
Figure DEST_PATH_IMAGE014
(3)
weight consistency objective function:
Figure DEST_PATH_IMAGE016
(4)
constraint conditions are as follows:
Figure DEST_PATH_IMAGE018
(5)
wherein m is the number of samples of the selected customer index, W_j represents the weight of the j-th index, and w_ij represents the weight component of the j-th index of the i-th sample;
step 3: calculating the value index V_j of each index of the client according to the initial value score and the weight assigned to the index, the value index expressing the score of the client at each index layer, by the formula:
V_j = x_j × W_j
and summing the value indexes V_j of the client's indexes to obtain the total value score V_sum:
Figure DEST_PATH_IMAGE026
(6)
step 4: sorting the customers and grading their value according to the customer total value score V_sum.
3. The express customer churn prediction method based on value breakdown and integrated prediction according to claim 1, wherein the network point customer loss prediction module comprises the following steps:
(1) constructing an index system influencing customer loss;
(2) constructing an improved chimpanzee optimization algorithm, and training a model to obtain related parameters;
(3) constructing and optimizing a BSGChOA _ XGboost model for training;
(4) and predicting the client loss under different indexes.
4. The express customer churn prediction method based on value breakdown and integrated prediction as claimed in claim 3, wherein the step (1) comprises the step that the index system influencing the customer churn is a 3-layer index system set comprising a target layer, a criterion layer and an index layer.
5. The express customer churn prediction method based on value breakdown and integrated prediction as claimed in claim 3, wherein the step (2) comprises the following steps,
step 1: initializing relevant parameter settings for optimizing a chimpanzee algorithm;
step 2: generating an initial population, performing improved Bernoulli chaotic mapping on the position of the initial population, introducing random variable factors to improve the uniform distribution of the initial population, generating a chaotic sequence in a [0, 1] interval through a chaotic mapping relation, and then converting the chaotic sequence into a search space of an individual to generate the initial population;
Figure DEST_PATH_IMAGE028
(7)
wherein i represents the current population scale, k represents the variable serial number of the chaotic mapping,
Figure DEST_PATH_IMAGE030
expressing the k mapping function value;
step 3: calculating the fitness of each individual in the chimpanzee population, selecting the 5 individual positions with the best fitness and recording them as X_attacker, X_observer, X_chaser, X_barrier and X_driver;
step 4: updating the convergence factor f(t) and the coefficient vectors a and c:
Figure DEST_PATH_IMAGE040
(8)
a = 2 × f(t) × R_3 − f(t) (9)
c = 2 × R_5 (10)
wherein,
Figure DEST_PATH_IMAGE046
the maximum number of iterations, k is the adjustment factor,
Figure DEST_PATH_IMAGE048
step 5: investigator
Figure DEST_PATH_IMAGE050
updates its position; the other individuals judge whether to take hunting action according to the current position information of the observer; if the current prey arrest success rate P_arrested is greater than the minimum arrest rate P_min, the attacker, the barrier, the driver and the chaser immediately take hunting action, while the observer continues to search for the next prey and updates its position information according to the mapped initial population information and its positional relationship to the other chimpanzee individuals, by the formula:
Figure DEST_PATH_IMAGE056
(11)
wherein P_arrested represents the current arrest rate, P_min represents the minimum arrest rate, R_2 represents a random number in [0, 1], and SND is a random number that obeys the standard normal distribution;
step 6: after the position of the observer is updated, the remaining individuals enter a search iteration stage, and the random factor R_4 is updated first to judge whether to enter global search or local search; if R_4 ≥ ε, the global search stage is entered; in the global search stage, an adaptive factor w that changes with the number of iterations is introduced, and the positions of the chimpanzee individuals in the population are updated according to the factor change curve, by the formulas:
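w = α(cosh(πt/T_max) + δ) (12)
X_1 = w_1 × {X_attacker − a_1 |C_1 X_attacker − m_1 X|} (13)
X_2 = w_2 × {X_chaser − a_2 |C_2 X_chaser − m_2 X|} (14)
X_3 = w_3 × {X_barrier − a_3 |C_3 X_barrier − m_3 X|} (15)
X_4 = w_4 × {X_driver − a_4 |C_4 X_driver − m_4 X|} (16)
X_5 = w_5 × {X_observer − a_5 |C_5 X_observer − m_5 X|} (17)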
Figure DEST_PATH_IMAGE078
(18)
step 7: if R_4 < ε, entering a local search stage; in the local search stage, in order to prevent the local optimum that may occur in the early and late iterations of the algorithm, a judgment on the number of iterations is introduced to determine whether the current iteration number t is less than the specified iteration number t*; if so, the conversion parameter β is updated and the position information of the attacker is updated by using an improved sine-cosine algorithm; otherwise, Gaussian mutation is applied to the position of the attacker,
after the sine and cosine algorithm is introduced, the position updating formula of the attacker in the population is as follows:
Figure DEST_PATH_IMAGE086
(19)
wherein X_attacker(t) indicates the position of the attacker in the t-th iteration; ρ_1, ρ_2 and ρ_3 are random numbers, with ρ_1 ∈ [0, 2π], ρ_2 ∈ [0, 2] and ρ_3 ∈ [0, 1]; p(t) indicates the position of the currently optimal individual; and β is the conversion parameter, calculated by the formula:
Figure DEST_PATH_IMAGE100
(20)
wherein β_max and β_min are respectively the maximum value and the minimum value of the conversion parameter, and T_max is the maximum number of iterations,
the mathematical model of gaussian variation for the attacker position is as follows:
X'_attacker = X × [1 + k × N(0, 1)] (21)
wherein X'_attacker is the updated position vector, X is the position vector of the current individual chimpanzee, k is a decreasing variable in [0, 1], and N(0, 1) is a Gaussian-distributed random vector with a mean of 0 and a variance of 1;
step 8: calculating the new fitness, the optimal individual and its position information according to the obtained new solution, and checking whether the current algorithm meets the iteration termination condition; if the maximum number of iterations T_max is reached, terminating and outputting the optimal position; otherwise, returning to Step 4 and executing it again.
6. The express customer churn prediction method based on value breakdown and integrated prediction according to claim 4, wherein: when the XGBoost model is constructed in the step (3), the timing sequence survey data of the client loss under various influence indexes is respectively used as the main characteristic input, the corresponding influence indexes influencing loss are used as labels, and the values of various parameters are optimized through a chimpanzee optimization algorithm model to minimize a target function, so as to establish an optimal tree structure model, which specifically comprises the following steps:
step 1: in the training stage of the BSGChOA _ XGboost model, initial parameters required by the prediction model are obtained through the optimization of a chimpanzee optimization algorithm,
step 2: selecting part of sample data in a criterion layer influencing an index set as a training set for model training and parameter optimization, calculating a fitness function value of the model, wherein the function value represents an optimal solution obtained by each operation of a chimpanzee optimization algorithm, the rest samples are used as a test set for carrying out final evaluation on the performance of the model, carrying out sampling prediction on the training set, then averaging the prediction results to obtain a final prediction result,
step 3: verifying the trained model by using the test set, evaluating whether the value of each parameter reaches the current optimal value according to the fitness function value of the prediction algorithm, and if so, replacing the original parameter; otherwise, continuing to keep the current parameters;
step 4: and (4) checking whether the algorithm meets an iteration termination condition, if the iteration termination condition reaches the maximum iteration times, terminating and outputting the optimal values of all parameters in the iteration process, and otherwise, returning and re-executing Step 2.
7. The express customer churn prediction method based on value breakdown and integrated prediction according to claim 1, wherein the single customer churn rate prediction module comprises the following steps:
step 1: generating various data characteristics of the client and constructing a client information system;
step 2: according to the information of the customer information system, a two-layer integrated learning algorithm is used for predicting the attrition rate of a single customer,
a first layer:
1) selecting n base classifiers and marking;
2) selecting the feature attribute sets F_1 = {b_1, b_2, ..., b_5} and F_3 = {b_8, b_9, ..., b_17} respectively to generate data sets D1 and D2, wherein the first m base classifiers are trained on the training sample set with the data sets D1 and D2 as input sets to obtain m prediction results P_k, and F_4 = {b_18, b_19, ..., b_23} is selected as the input data of the remaining n−m classifiers to obtain n−m prediction results P_k;
3) integrating the training results of the n individual models to construct the n-dimensional feature vector P = (P_1, P_2, P_3, ..., P_n)^T;
A second layer:
1) the output feature vector P is used as the input of a linear classifier, the weight weight_j of each model in the integrated model is learned by a gradient descent method, the final prediction result based on the above n base prediction models is obtained as P*, and the final probability of customer churn is then predicted by weighting:
Figure DEST_PATH_IMAGE128
(22)
wherein,
Figure DEST_PATH_IMAGE130
representing the runoff prediction probability generated by the jth model,
Figure 956806DEST_PATH_IMAGE124
representing the weight assigned to the jth model.
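The second layer described above can be illustrated with a minimal sketch. Because formula (22) and the weight definitions are only available as images, everything specific here is an assumption: the base-model churn probabilities are taken as given (the matrix `P` is hypothetical), the weights are parameterised through a softmax so that they are non-negative and sum to 1, and the loss is mean squared error. The claim itself only states that the weights are learned by gradient descent and that the final churn probability is a weighted combination of the base-model probabilities.

```python
import math

def softmax(theta):
    """Map unconstrained parameters to non-negative weights that sum to 1."""
    peak = max(theta)
    exps = [math.exp(t - peak) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]

def combine(p_row, weights):
    """Weighted churn probability for one customer: sum_j w_j * p_j."""
    return sum(w * p for w, p in zip(weights, p_row))

def learn_weights(P, y, lr=1.0, epochs=500):
    """Learn model weights by batch gradient descent on mean squared error.

    P: one row per customer, one column per base model (churn probabilities).
    y: 1 if the customer churned, else 0.
    The softmax parameterisation and the loss are assumptions of this sketch.
    """
    n_models = len(P[0])
    theta = [0.0] * n_models  # equal weights at the start
    for _ in range(epochs):
        w = softmax(theta)
        grad = [0.0] * n_models
        for p_row, label in zip(P, y):
            pred = combine(p_row, w)
            err = pred - label
            for k in range(n_models):
                # For softmax weights, d(pred)/d(theta_k) = w_k * (p_k - pred).
                grad[k] += 2.0 * err * w[k] * (p_row[k] - pred)
        theta = [t - lr * g / len(P) for t, g in zip(theta, grad)]
    return softmax(theta)

# Hypothetical first-layer outputs: model 0 is informative, model 1 is not.
P = [[0.9, 0.5], [0.1, 0.5], [0.8, 0.5], [0.2, 0.5]]
y = [1, 0, 1, 0]
weights = learn_weights(P, y)
final_probability = combine([0.85, 0.5], weights)
```

In this toy data the informative model ends up with the larger weight, so the combined probability for a new customer leans toward that model's estimate. Keeping the weights on the simplex is a design choice of the sketch that guarantees the output stays a valid probability.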
8. The express customer churn prediction method based on value subdivision and integrated prediction according to claim 7, wherein the data features in Step 1 are composed of customer basic features
Figure DEST_PATH_IMAGE132
order features
Figure DEST_PATH_IMAGE134
customer-order interaction features
Figure DEST_PATH_IMAGE136
and order activity features
Figure DEST_PATH_IMAGE138
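As a rough illustration of how the four feature groups named in this claim could be merged into one record of a customer information system, the sketch below concatenates them into a flat dictionary. The function name, the group prefixes, and every feature name in the example are hypothetical; the patent's actual feature definitions are only available as formula images.

```python
def build_customer_record(base, order, interaction, activity):
    """Merge four feature groups into one flat record with prefixed keys.

    Prefixing each key with its group name avoids collisions when two
    groups happen to use the same feature name.
    """
    groups = (
        ("base", base),
        ("order", order),
        ("interaction", interaction),
        ("activity", activity),
    )
    record = {}
    for prefix, features in groups:
        for name, value in features.items():
            record[f"{prefix}.{name}"] = value
    return record

# Hypothetical feature values for one customer.
record = build_customer_record(
    base={"account_age_days": 400},
    order={"avg_parcel_weight_kg": 1.2},
    interaction={"orders_per_month": 3.5},
    activity={"days_since_last_order": 12},
)
```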
9. The express customer churn prediction method based on value subdivision and integrated prediction according to claim 1, wherein the customer retention scheme design module comprises the following steps:
step 1: with reference to the customer churn rate and the customer churn amount predicted by the prediction model, revisiting the customers with the highest churn probability and offering preferential activities for returning old users;
step 2: for influence indexes that affect customer churn at service outlets, such as the queue length
Figure DEST_PATH_IMAGE140
constructing a target constraint model relating customer churn to waiting time, service intensity and queue length,
the objective function is:
Figure DEST_PATH_IMAGE142
(23)
the constraint conditions are as follows:
Figure DEST_PATH_IMAGE144
(24)
wherein,
Figure DEST_PATH_IMAGE146
is the average service intensity,
Figure DEST_PATH_IMAGE148
is the maximum service intensity that the service personnel can withstand,
Figure DEST_PATH_IMAGE150
is the average waiting time of the customer,
Figure DEST_PATH_IMAGE152
is the maximum residence time that the customer can endure, and
Figure DEST_PATH_IMAGE140
is the average queue length.
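The constraints on service intensity and waiting time can be checked numerically once a queueing model is fixed. The sketch below assumes an M/M/c queue (Poisson arrivals, exponential service times, c identical service windows), which the claim does not specify, and uses the standard Erlang C formula. Since the objective function (23) is only available as an image, the search for the smallest number of windows that satisfies both constraints is a stand-in objective chosen purely for illustration; all function and parameter names are hypothetical.

```python
import math

def mmc_metrics(arrival_rate, service_rate, servers):
    """Return (service intensity, mean wait in queue, mean queue length)
    for an M/M/c queue. The M/M/c model is an assumption of this sketch."""
    offered = arrival_rate / service_rate
    rho = offered / servers  # average service intensity
    if rho >= 1.0:
        return rho, math.inf, math.inf  # queue grows without bound
    head = sum(offered ** k / math.factorial(k) for k in range(servers))
    tail = offered ** servers / (math.factorial(servers) * (1.0 - rho))
    p_wait = tail / (head + tail)  # Erlang C: probability of having to wait
    queue_len = p_wait * rho / (1.0 - rho)
    wait = queue_len / arrival_rate  # Little's law
    return rho, wait, queue_len

def min_servers(arrival_rate, service_rate, max_intensity, max_wait, limit=50):
    """Smallest number of windows meeting both constraints, or None."""
    for c in range(1, limit + 1):
        rho, wait, _ = mmc_metrics(arrival_rate, service_rate, c)
        if rho <= max_intensity and wait <= max_wait:
            return c
    return None

# Hypothetical outlet: 2 customers arrive per minute, each window serves 3 per minute.
rho, wait, queue_len = mmc_metrics(2.0, 3.0, 1)
needed = min_servers(2.0, 3.0, max_intensity=0.8, max_wait=0.5)
```

With a single window the service intensity constraint holds in this example but the mean wait exceeds the limit, so the search moves on to two windows.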
CN202210236263.7A 2022-03-11 2022-03-11 Express customer loss prediction method based on value subdivision and integrated prediction Active CN115115389B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210236263.7A CN115115389B (en) 2022-03-11 2022-03-11 Express customer loss prediction method based on value subdivision and integrated prediction

Publications (2)

Publication Number Publication Date
CN115115389A true CN115115389A (en) 2022-09-27
CN115115389B CN115115389B (en) 2024-07-23

Family

ID=83324727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210236263.7A Active CN115115389B (en) 2022-03-11 2022-03-11 Express customer loss prediction method based on value subdivision and integrated prediction

Country Status (1)

Country Link
CN (1) CN115115389B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583651A (en) * 2018-12-03 2019-04-05 焦点科技股份有限公司 A kind of method and apparatus for insuring electric business platform user attrition prediction
CN109784966A (en) * 2018-11-29 2019-05-21 昆明理工大学 A kind of music website customer churn prediction method
CN112561598A (en) * 2020-12-23 2021-03-26 中国农业银行股份有限公司重庆市分行 Customer loss prediction and retrieval method and system based on customer portrait

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115828818A (en) * 2023-02-02 2023-03-21 湖北工业大学 Photovoltaic cell parameter identification method and storage medium
CN115828818B (en) * 2023-02-02 2023-05-16 湖北工业大学 Photovoltaic cell parameter identification method and storage medium
CN116203907A (en) * 2023-03-27 2023-06-02 淮阴工学院 Chemical process fault diagnosis alarm method and system
CN116203907B (en) * 2023-03-27 2023-10-20 淮阴工学院 Chemical process fault diagnosis alarm method and system
CN116627027A (en) * 2023-07-19 2023-08-22 济南大学 Optimal robustness control method based on improved PID
CN116627027B (en) * 2023-07-19 2024-01-30 济南大学 Optimal robustness control method based on improved PID
CN117408742A (en) * 2023-12-15 2024-01-16 湖南三湘银行股份有限公司 User screening method and system
CN117408742B (en) * 2023-12-15 2024-04-02 湖南三湘银行股份有限公司 User screening method and system

Also Published As

Publication number Publication date
CN115115389B (en) 2024-07-23

Similar Documents

Publication Publication Date Title
CN115115389B (en) Express customer loss prediction method based on value subdivision and integrated prediction
CN110363282B (en) Network node label active learning method and system based on graph convolution network
CN108921604B (en) Advertisement click rate prediction method based on cost-sensitive classifier integration
CN111105045A (en) Method for constructing prediction model based on improved locust optimization algorithm
CN110751289B (en) Online learning behavior analysis method based on Bagging-BP algorithm
US11914672B2 (en) Method of neural architecture search using continuous action reinforcement learning
WO2022068934A1 (en) Method of neural architecture search using continuous action reinforcement learning
WO2019205544A1 (en) Fairness-balanced result prediction classifier for context perceptual learning
CN112581264A (en) Grasshopper algorithm-based credit risk prediction method for optimizing MLP neural network
CN111061959A (en) Developer characteristic-based crowd-sourcing software task recommendation method
CN117787569B (en) Intelligent auxiliary bid evaluation method and system
CN113326976B (en) Port freight volume online prediction method and system based on time-space correlation
KR100895481B1 (en) Method for Region Based on Image Retrieval Using Multi-Class Support Vector Machine
CN111292062B (en) Network embedding-based crowd-sourced garbage worker detection method, system and storage medium
CN116977010A (en) Construction of service recommendation model, service recommendation method and device
CN115049006A (en) Communication signal identification method and system based on self-adaptive feedforward neural network
CN113850483A (en) Enterprise credit risk rating system
CN113191527A (en) Prediction method and device for population prediction based on prediction model
CN117952592B (en) Intelligent management method for charging pile
CN116228037B (en) Logistics management method and system based on knowledge base
US20240013058A1 (en) Information processing method, information processing apparatus, and non-transitory computer-readable storage medium
Gao et al. Adaptive decision method in C3I system
US20240013057A1 (en) Information processing method, information processing apparatus, and non-transitory computer-readable storage medium
JP7283548B2 (en) LEARNING APPARATUS, PREDICTION SYSTEM, METHOD AND PROGRAM
US20240012881A1 (en) Information processing method, information processing apparatus, and non-transitory computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant