CN107832581A

CN107832581A - State prediction method and device

Info

Publication number: CN107832581A
Application number: CN201711349699.2A
Authority: CN
Inventors: 胡瑞华
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Baidu Online Network Technology Beijing Co Ltd; Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2017-12-15
Filing date: 2017-12-15
Publication date: 2018-03-23
Anticipated expiration: 2037-12-15
Also published as: CN107832581B

Abstract

The present invention provides a state prediction method and device. The method includes: sampling target users; generating negative samples, positive samples and validation samples from the account information of the users identified as churned and as not churned at the sampling time; training, on the positive and negative samples, a decision tree model for predicting the churn state of users after the sampling time; inputting the validation samples into the trained decision tree model to obtain predicted churn states; and, if the precision-recall rate calculated from the predicted and actual churn states is not less than a threshold, determining that training of the decision tree model is complete and using the model to predict user churn states. Because the decision tree model is trained on samples generated from the sampled target users and the trained model is then used to predict user churn states, the method solves the technical problem in the prior art that predicting user churn by manual experience or hand-written rules yields low identification efficiency and is difficult to reuse.

Description

State prediction method and device

Technical field

The present invention relates to the technical field of information processing, and in particular to a state prediction method and device.

Background art

With the development of the Internet, many Internet companies have large numbers of paying customers, and the customer base grows rapidly year by year, so after-sales maintenance of customers has become particularly important. Current after-sales service has two characteristics. First, customer problems are varied, complex and disorganized, and the turnover of customer service staff is high; faced with this situation, most customer service agents do not know what to do or how to do it. Second, customer service resources are limited: a single agent maintains hundreds of accounts, so the workload is heavy, the work is difficult, and service quality is hard to guarantee.

In this situation, it is particularly important to screen out the accounts that have problems and need maintenance and to dispatch work tasks to the customer service agents responsible for those customers. In the related art, however, either a dedicated team manually screens all customers every day, or customer screening rules are formulated to filter out the customers that need maintenance, and the results are dispatched to the corresponding agents through a dispatch system. This approach relies on a specific group of experienced people, does not necessarily express the business requirements accurately, and may miss or wrongly select customers, which results in low identification efficiency.

Summary of the invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art.

To this end, a first object of the present invention is to provide a state prediction method in which positive samples, validation samples and negative samples are generated by sampling target users, a decision tree model is trained on them, and the trained decision tree model is used to predict the churn state of users after the sampling time. This solves the technical problem in the prior art that predicting user churn by manual experience or hand-written rules yields low identification efficiency and is difficult to reuse.

A second object of the present invention is to provide a state prediction device.

A third object of the present invention is to provide a computer device.

A fourth object of the present invention is to provide a non-transitory computer-readable storage medium.

A fifth object of the present invention is to provide a computer program product.

To achieve the above objects, an embodiment of the first aspect of the present invention provides a state prediction method, including:

sampling target users;

identifying, from the sampled target users, the users who have churned and the users who have not churned at the sampling time, where a churned user is a user who did not perform a target behavior within a target duration before the sampling time, and a non-churned user is a user who performed the target behavior within the target duration before the sampling time;

generating negative samples from the account information of the users who have churned at the sampling time, and generating positive samples and validation samples from the account information of the users who have not churned at the sampling time;

training, according to the negative samples and the positive samples, a decision tree model for predicting the churn state of users after the sampling time;

inputting the validation samples into the trained decision tree model to obtain predicted churn states;

if the precision-recall rate calculated from the predicted churn states and the actual churn states of the validation samples is not less than a threshold, determining that training of the decision tree model is complete, where the actual churn state is determined according to whether the user performed the target behavior within the target duration after the sampling time; and

predicting user churn states with the trained decision tree model.

In the state prediction method of the embodiment of the present invention, target users are sampled; negative samples, positive samples and validation samples are generated from the account information of the users identified as churned and as not churned at the sampling time; a decision tree model for predicting the churn state of users after the sampling time is trained on the positive and negative samples; the validation samples are input into the trained decision tree model to obtain predicted churn states; and, if the precision-recall rate calculated from the predicted and actual churn states is not less than a threshold, training of the decision tree model is determined to be complete and user churn states are predicted. Because the decision tree model is trained on the samples generated by sampling and the trained model is used to predict user churn states, the method solves the technical problem in the prior art that predicting user churn by manual experience or hand-written rules yields low identification efficiency and is difficult to reuse.

To achieve the above objects, an embodiment of the second aspect of the present invention provides a state prediction device, including:

a sampling module, configured to sample target users;

a sample generation module, configured to identify, from the sampled target users, the users who have churned and the users who have not churned at the sampling time, where a churned user is a user who did not perform a target behavior within a target duration before the sampling time, and a non-churned user is a user who performed the target behavior within the target duration before the sampling time; and to generate negative samples from the account information of the users who have churned at the sampling time, and positive samples and validation samples from the account information of the users who have not churned at the sampling time;

a training module, configured to train, according to the negative samples and the positive samples, a decision tree model for predicting the churn state of users after the sampling time;

a validation module, configured to input the validation samples into the trained decision tree model to obtain predicted churn states, and, if the predicted churn states of the validation samples are identical to the actual churn states, to determine that training of the decision tree model is complete, where the actual churn state is determined according to whether the user performed the target behavior within the target duration after the sampling time; and

a prediction module, configured to predict user churn states with the trained decision tree model.

In the state prediction device of the embodiment of the present invention, the sampling module samples target users; the sample generation module generates negative samples, positive samples and validation samples from the account information of the users identified as churned and as not churned at the sampling time; the training module trains, on the positive and negative samples, a decision tree model for predicting the churn state of users after the sampling time; the validation module inputs the validation samples into the trained decision tree model to obtain predicted churn states and, if the precision-recall rate calculated from the predicted and actual churn states is not less than a threshold, determines that training of the decision tree model is complete; and the prediction module predicts user churn states with the trained decision tree model. Because the decision tree model is trained on the samples generated by sampling and the trained model is used to predict user churn states, the device solves the technical problem in the prior art that predicting user churn by manual experience or hand-written rules yields low identification efficiency and is difficult to reuse.

To achieve the above objects, an embodiment of the third aspect of the present invention provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the state prediction method described in the first aspect is implemented.

To achieve the above objects, an embodiment of the fourth aspect of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the state prediction method described in the first aspect is implemented.

To achieve the above objects, an embodiment of the fifth aspect of the present invention provides a computer program product; when the instructions in the computer program product are executed by a processor, the state prediction method described in the first aspect is implemented.

Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments with reference to the accompanying drawings, in which:

Fig. 1 is a schematic flowchart of a state prediction method provided by an embodiment of the present invention;

Fig. 2 is a schematic flowchart of another state prediction method provided by an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a state prediction device provided by an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of another state prediction device provided by an embodiment of the present invention; and

Fig. 5 is a block diagram of an exemplary computer device suitable for implementing embodiments of the present application.

Detailed description of the embodiments

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and are not to be construed as limiting it.

The state prediction method and device of the embodiments of the present invention are described below with reference to the accompanying drawings.

With more than a million user accounts, maintaining accounts is demanding work for customer service agents. Account maintenance currently follows one of two schemes:

Scheme one, manual screening: a dedicated data analysis team downloads user account metric data according to the needs of the business scenario, performs a specific data analysis, filters out a target account set, and dispatches tasks to the corresponding customer service agents. This method has two drawbacks. First, accounts can only be screened manually, relying on the experience of the data analysis team and its familiarity with the business; the process is complex and hard to reuse, and erroneous data caused by oversight arises easily. Second, tasks must be dispatched daily, so account screening must also be repeated daily, which consumes a great deal of manpower.

Scheme two, rule formulation: account information is abstracted into various account metrics, and a flexible rule-setting interface is provided. Experienced customer service agents set task dispatch rules according to the needs of the business scenario; the account set that satisfies the metric conditions of the rules is filtered out, and tasks are dispatched to the corresponding customer service accounts. This method also has two drawbacks. First, the rules are formulated entirely from experience and therefore include human uncertainty. Second, rule setting is very limited and inflexible; it is difficult to cover the various metric features and the relationships between metrics, so the business meaning cannot be expressed accurately and the accounts filtered out are not necessarily all correct.

To solve the above problems, the present invention provides a state prediction method: training samples are obtained by sampling target users, a decision tree model is trained on the training samples, and the trained decision tree model is used to make predictions for user accounts, that is, to predict the churn state of customers. A prediction model is thus generated from the user features in the samples and the churn state of users is predicted automatically, which improves identification efficiency.

Fig. 1 is a schematic flowchart of a state prediction method provided by an embodiment of the present invention.

As shown in Fig. 1, the method includes:

Step 101: sample target users.

Optionally, before the target users are sampled, active users are first identified among all users. An active user is a user who has performed a target behavior, where the target behavior includes at least one of a purchase behavior, a browsing behavior and a renewal behavior. The identified active users are then set as the target users, and the target users are sampled according to a preset sampling principle; for example, a certain number of target users are selected at random from the identified active users.
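The selection in step 101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the record layout (`user_id`, `behaviors`), the behavior names and the function names are hypothetical and introduced only for this example.

```python
import random

# Hypothetical names for the three target behaviors listed in the text.
TARGET_BEHAVIORS = {"purchase", "browse", "renew"}

def identify_active_users(users):
    """Return the users that performed at least one target behavior."""
    return [u for u in users if TARGET_BEHAVIORS & set(u["behaviors"])]

def sample_target_users(users, sample_size, seed=None):
    """Randomly draw up to `sample_size` target users from the active users."""
    active = identify_active_users(users)
    rng = random.Random(seed)
    return rng.sample(active, min(sample_size, len(active)))

# Illustrative data: users 0..9, where even IDs have made a purchase.
all_users = [
    {"user_id": i, "behaviors": ["purchase"] if i % 2 == 0 else []}
    for i in range(10)
]
sampled = sample_target_users(all_users, sample_size=3, seed=0)
```

Random selection is only the example sampling principle given in the text; any other preset principle could replace `rng.sample`.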

Step 102: from the sampled target users, identify the users who have churned and the users who have not churned at the sampling time.

Specifically, the sampled target users include churned users and non-churned users. The users who have churned and who have not churned at the sampling time are identified according to the times at which the target behavior was performed, as recorded in the target users' account information, and the set target duration. A churned user is a user who did not perform the target behavior within the target duration before the sampling time; a non-churned user is a user who performed the target behavior within the target duration before the sampling time.

Step 103: generate negative samples from the account information of the users who have churned at the sampling time, and generate positive samples and validation samples from the account information of the users who have not churned at the sampling time.

Specifically, negative samples are generated from the account information of the users who have churned at the sampling time, and positive samples and validation samples are generated from the account information of the users who have not churned at the sampling time. The positive and negative samples are used to train the model, and the validation samples are used to evaluate the trained model.
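The sample generation in step 103 can be sketched as below. The text does not say how the non-churned accounts are divided between positive and validation samples, so the random split and the `validation_fraction` parameter are assumptions made for illustration, as are the function and variable names.

```python
import random

def generate_samples(churned_accounts, retained_accounts,
                     validation_fraction=0.2, seed=None):
    """Build negative, positive and validation samples from labeled accounts.

    Churned accounts become negative samples (label 0). Non-churned accounts
    are shuffled and split into positive samples (label 1) and validation
    samples. The split ratio is an assumption, not taken from the text.
    """
    rng = random.Random(seed)
    retained = list(retained_accounts)
    rng.shuffle(retained)
    n_val = int(len(retained) * validation_fraction)
    validation = retained[:n_val]
    positive = [(acct, 1) for acct in retained[n_val:]]
    negative = [(acct, 0) for acct in churned_accounts]
    return negative, positive, validation

negative, positive, validation = generate_samples(
    churned_accounts=["a1", "a2"],
    retained_accounts=["b1", "b2", "b3", "b4", "b5"],
    validation_fraction=0.2,
    seed=1,
)
```

The validation samples carry no label here because, according to the text, their actual churn state is only determined after the sampling time.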

Step 104: according to the negative samples and the positive samples, train a decision tree model for predicting the churn state of users after the sampling time.

Specifically, feature extraction is performed on the negative and positive samples to obtain attribute features and behavioral features. The attribute features include at least one of account validity status, account opening and account operating entity; the behavioral features include total spending, total clicks, account balance and the number of days between the most recent spending and the sampling time. The features of the negative and positive samples are used as the input of the decision tree model, the churned or non-churned state is used as the classification output of the decision tree, and the training process is performed on the decision tree model using the decision-tree-based extreme gradient boosting (xgboost) algorithm from machine learning.
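The feature extraction described above can be sketched as follows. The field names of the account record (`is_valid`, `opened_on`, `total_spend`, and so on) are hypothetical, and encoding account opening as the number of days since the opening date is an assumption; the text only names the features, not their encoding.

```python
from datetime import date

def extract_features(account, sampling_date):
    """Map one account record to a flat dictionary of numeric features."""
    return {
        # Attribute features
        "account_valid": 1.0 if account["is_valid"] else 0.0,
        "days_since_opening": float((sampling_date - account["opened_on"]).days),
        # Behavioral features
        "total_spend": float(account["total_spend"]),
        "total_clicks": float(account["total_clicks"]),
        "balance": float(account["balance"]),
        "days_since_last_spend": float(
            (sampling_date - account["last_spend_on"]).days
        ),
    }

account = {
    "is_valid": True,
    "opened_on": date(2017, 1, 1),
    "total_spend": 1200,
    "total_clicks": 340,
    "balance": 56.5,
    "last_spend_on": date(2017, 6, 20),
}
features = extract_features(account, date(2017, 6, 30))
```

All values are converted to floats because the model input described in the text accepts only numeric values.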

The xgboost model is a boosted-tree model. A boosted tree makes its decision jointly over multiple regression trees built iteratively. When a squared-error loss function is used, each regression tree learns from the conclusions and residuals of all preceding trees and is fitted to the current residual, where residual = actual value - predicted value; the boosted tree is the sum of the regression trees generated over the whole iterative process. The traditional gradient boosting decision tree (GBDT) uses CART as the base classifier. xgboost also supports linear classifiers, in which case it is equivalent to logistic regression (for classification problems) or linear regression (for regression problems) with L1 and L2 regularization terms. In addition, xgboost applies a second-order Taylor expansion to the cost function, using both the first and second derivatives, and adds a regularization term to the cost function to control the complexity of the model.
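The idea of fitting each new tree to the current residual can be sketched in plain Python. This is a deliberately simplified illustration with one-split regression stumps and squared-error loss; it is not the xgboost algorithm, which additionally uses second derivatives and regularization. The function names and the `learning_rate` shrinkage factor are assumptions made for this example.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum(
            (r - right_mean) ** 2 for r in right
        )
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # All x values are identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Additively fit stumps to the current residuals."""
    trees = []
    predictions = [0.0] * len(xs)
    for _ in range(rounds):
        # residual = actual value - predicted value
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [
            p + learning_rate * tree(x) for p, x in zip(predictions, xs)
        ]
    # The final model is the (scaled) sum of all fitted trees.
    return lambda x: sum(learning_rate * t(x) for t in trees)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = boost(xs, ys)
```

Each round shrinks the remaining residual, so the summed prediction approaches the actual values as trees are added.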

Specifically, the xgboost model adds one new tree in each training round; the outputs of the trees are combined additively, and each new tree is fitted on top of what has already been learned. Training the decision tree means finding the optimal decision tree, which is done by establishing a target, namely an objective function. The principle of the objective function is as follows:

The model predicts $y_i$ (in this embodiment, the churned or non-churned state) from a given input $x_i$ (in this embodiment, the features of the positive and negative samples). The prediction function of a linear model is

$$\hat{y}_i = \sum_j \theta_j x_{ij}.$$

Depending on how $y_i$ is interpreted, different tasks arise, such as regression, classification and ranking; in this embodiment the task is classification. The best parameters must be found from the training data, where the parameters generally refer to the coefficients $\theta_j$. To find the best parameters, an objective function must be defined to measure the performance of the model. It consists of two parts, a training error function and a regularization term:

$$\mathrm{obj}(\theta) = L(\theta) + \Omega(\theta),$$

where $L(\theta)$ is the training error function and $\Omega(\theta)$ is the regularization term. When the mean squared error (MSE) is used as the training error function,

$$L(\theta) = \sum_i (y_i - \hat{y}_i)^2.$$

In practice a single tree is not powerful enough, so the predictions of multiple trees are added together to give the final prediction:

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F},$$

where $K$ is the number of trees, $f$ is a function in the function space $\mathcal{F}$, and $\mathcal{F}$ is the set of all possible regression trees (CART). The objective function to be optimized (that is, the loss function with regularization added) can therefore be written as

$$\mathrm{obj}(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k),$$

where $n$ is the number of samples and $l$ is the loss for a single sample.

At each step $t$, it must be determined which tree to add so as to optimize the objective, which gives the objective function

$$\mathrm{obj}^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) + \mathrm{constant},$$

where $\hat{y}_i^{(t-1)}$ is the prediction of the first $t-1$ trees, $f_t$ is the tree added at step $t$, and constant is a constant term.
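The second-order Taylor expansion mentioned earlier in this description can be written out for the objective at step $t$. The following is the standard form of that expansion, given as a worked illustration; the symbols $g_i$ and $h_i$ are introduced here and do not appear in the original text. Here $\mathrm{obj}^{(t)}$ denotes the objective at step $t$, $l$ the loss for a single sample, $\hat{y}_i^{(t-1)}$ the prediction of the first $t-1$ trees, $f_t$ the tree added at step $t$, and $\Omega$ the regularization term.

```latex
\begin{aligned}
\mathrm{obj}^{(t)} &\approx \sum_{i=1}^{n}\Big[\, l\big(y_i,\hat{y}_i^{(t-1)}\big)
  + g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t^{2}(x_i) \Big]
  + \Omega(f_t) + \mathrm{constant},\\
g_i &= \frac{\partial\, l\big(y_i,\hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\qquad
h_i = \frac{\partial^{2}\, l\big(y_i,\hat{y}_i^{(t-1)}\big)}{\partial\, \big(\hat{y}_i^{(t-1)}\big)^{2}}.
\end{aligned}
```

Because $l\big(y_i,\hat{y}_i^{(t-1)}\big)$ does not depend on $f_t$, only the terms in $g_i$ and $h_i$ and the regularization term matter when choosing the new tree, which is why both the first and the second derivatives of the loss are used.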

The xgboost data format is the libsvm data format, as follows:

[label] [index1]:[value1] [index2]:[value2] …

The input of the xgboost model supports only numeric types; that is, each value can only be a number. To reduce memory usage during model training and prediction and to speed up matrix inner-product computation, the final input data is expressed in the following form, for example:

1 101:1.2 102:0.03

2 1:2.1,10001:300 10002:400

···

Each line represents one sample. The number in the first column is the class label, indicating the class to which the sample belongs; '101' and '102' are feature indices, and '1.2' and '0.03' are the corresponding feature values. In binary classification, '1' denotes the positive class and '0' the negative class.
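Writing and reading one line in this format can be sketched as follows. The two helper functions are hypothetical and written for illustration; omitting zero-valued features is a common convention for sparse data and an assumption here, not something the text prescribes.

```python
def to_libsvm_line(label, features):
    """Format one sample as `label index:value index:value ...`.

    `features` maps integer feature indices to numeric values. Indices are
    written in ascending order and zero-valued features are omitted.
    """
    parts = [str(label)]
    for index in sorted(features):
        value = features[index]
        if value != 0:
            parts.append(f"{index}:{value}")
    return " ".join(parts)

def parse_libsvm_line(line):
    """Parse a libsvm-format line back into (label, {index: value})."""
    label, *pairs = line.split()
    features = {}
    for pair in pairs:
        index, value = pair.split(":")
        features[int(index)] = float(value)
    return int(label), features

line = to_libsvm_line(1, {101: 1.2, 102: 0.03})
parsed_label, parsed_features = parse_libsvm_line(line)
```

Storing only the non-zero index:value pairs is what keeps memory usage low when most features of a sample are zero.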

Further, to improve the performance of the xgboost model, its parameters need to be tuned. A commonly used general procedure is as follows:

1) Select a relatively high learning rate. In general the learning rate is 0.1, but for different problems the ideal learning rate may range between 0.05 and 0.3. Then select the ideal number of decision trees for this learning rate. xgboost has a very useful function, "cv", which performs cross-validation at each iteration and returns the ideal number of decision trees.

2) For the given learning rate and number of decision trees, tune the tree-specific parameters. The key parameters, with their values in brackets, are as follows:

Iteration model booster [gbtree]: the model used in each iteration; gbtree is a tree-based model.

Positive/negative sample balance parameter scale_pos_weight [1]: the churned-customer and non-churned-customer samples are highly imbalanced; setting this parameter to a positive value lets the algorithm converge faster.

Iteration weight eta [0.1]: reducing the weight of each step improves the robustness of the model; typical values are 0.01 to 0.2.

Loss function to minimize objective [binary:logistic]: defines the loss function to be minimized. The final prediction result of the present invention is the customer churn probability, so binary:logistic, logistic regression for binary classification, is selected; it returns the predicted probability rather than the class.

Evaluation metric for validation data eval_metric [rmse, error, auc]: the metric applied to the validation data; rmse is the root-mean-square error, mae the mean absolute error and auc the area under the curve.

Maximum tree depth max_depth [5]: the maximum depth of a tree, used to control over-fitting; the larger the value, the more specific and local the patterns the model learns from the samples.

Number of iterations num_boost_round: the number of iterations is increased in order to address a serious over-fitting problem.

3) Tune the xgboost regularization parameters (lambda, alpha). These parameters reduce the complexity of the model and thereby improve its performance.

4) Lower the learning rate and determine the ideal parameters.

Tuning the model parameters in this way improves the accuracy and efficiency of model prediction.
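The parameter settings listed above can be collected in a dictionary of the kind the xgboost Python package accepts. This is a sketch under assumptions: the bracketed values come from the text, while the `lambda` and `alpha` values are xgboost's documented defaults because the text gives none, and a single evaluation metric is chosen for simplicity. The helper `choose_num_rounds` and its arguments are hypothetical; it requires the third-party `xgboost` package and a `DMatrix` prepared by the caller.

```python
# Values in this dictionary are the bracketed values quoted in the text,
# except where a comment says otherwise.
params = {
    "booster": "gbtree",
    "scale_pos_weight": 1,
    "eta": 0.1,
    "objective": "binary:logistic",
    "eval_metric": "auc",  # one of the metrics named in the text
    "max_depth": 5,
    "lambda": 1,  # L2 regularization; xgboost default, not from the text
    "alpha": 0,   # L1 regularization; xgboost default, not from the text
}

def choose_num_rounds(dtrain, max_rounds=200, nfold=5, patience=20):
    """Pick the number of boosting rounds with xgboost's cross-validation.

    `dtrain` is an xgboost.DMatrix built by the caller. The package is
    imported lazily so the dictionary above can be used without it.
    """
    import xgboost as xgb

    history = xgb.cv(
        params,
        dtrain,
        num_boost_round=max_rounds,
        nfold=nfold,
        early_stopping_rounds=patience,
    )
    # With early stopping, the returned history ends at the best round.
    return len(history)
```

The returned round count would then be passed as `num_boost_round` when training the final model, after which the tree-specific and regularization parameters can be tuned in turn as described in steps 2) to 4).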

Step 105: input the validation samples into the trained decision tree model to obtain predicted churn states.

Specifically, the predetermined validation samples are input into the trained decision tree model to obtain the predicted churn states.

Step 106: if the precision-recall rate calculated from the predicted churn states and the actual churn states of the validation samples is not less than a threshold, determine that training of the decision tree model is complete.

Specifically, the precision-recall rate is calculated from the predicted churn states obtained in step 105 and the actual churn states, and is compared with a preset threshold; if the precision-recall rate is not less than the threshold, training of the decision tree model is determined to be complete. The actual churn state is determined according to whether the user performed the target behavior within the target duration after the sampling time.

Step 107: predict user churn states with the trained decision tree model.

Specifically, the target users are input into the trained decision tree model and their churn states are predicted. If the predicted churn state of a target user is churned, a task instructing that the target user be maintained is generated. After the task is dispatched, the actual churn state of the target user, as determined during execution of the task, is obtained, and the accuracy of the decision tree model is determined from the actual and predicted churn states of the target user.
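The task generation and the later accuracy check in step 107 can be sketched as below. The task record layout, the 0.5 probability cut-off and the function names are assumptions made for illustration; the input is assumed to map each user ID to a predicted churn probability.

```python
def generate_maintenance_tasks(predictions, threshold=0.5):
    """Create one maintenance task per user predicted to churn.

    `predictions` maps user IDs to predicted churn probabilities; the
    cut-off `threshold` is a hypothetical value.
    """
    return [
        {"user_id": user_id, "action": "maintain"}
        for user_id, probability in sorted(predictions.items())
        if probability >= threshold
    ]

def prediction_accuracy(predicted_churn, actual_churn):
    """Fraction of users whose predicted churn state matches the actual one."""
    matches = sum(
        predicted_churn[user] == actual_churn[user] for user in predicted_churn
    )
    return matches / len(predicted_churn)

tasks = generate_maintenance_tasks({"u1": 0.9, "u2": 0.2, "u3": 0.6})
accuracy = prediction_accuracy(
    {"u1": True, "u2": False, "u3": True},   # predicted churn states
    {"u1": True, "u2": False, "u3": False},  # states found during task execution
)
```

In this toy run two users receive maintenance tasks, and two of the three predictions match the states observed afterwards.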

In the state prediction method of the embodiment of the present invention, target users are sampled; negative samples, positive samples and validation samples are generated from the account information of the users identified as churned and as not churned at the sampling time; a decision tree model for predicting the churn state of users after the sampling time is trained on the positive and negative samples; the validation samples are input into the trained decision tree model to obtain predicted churn states; and, if the precision-recall rate calculated from the predicted and actual churn states is not less than a threshold, training of the decision tree model is determined to be complete and user churn states are predicted. Because the decision tree model is trained on the samples generated by sampling and the trained model is used to predict user churn states, the method solves the technical problem in the prior art that predicting user churn by manual experience or hand-written rules consumes considerable manpower, gives inaccurate predictions and cannot be reused.

Based on the above embodiment, the present invention further provides a possible implementation of the state prediction method that explains more clearly how the decision tree model is validated and optimized. Fig. 2 is a schematic flowchart of another state prediction method provided by an embodiment of the present invention. As shown in Fig. 2, the method includes:

Step 201: sample the target users, who are the users identified as active among all users.

Specifically, active users are identified among all users and taken as the target users, and the target users are sampled. For example, with a sampling time of June 30, 2017, 200 users are selected at random from the target users as the sampled target users.

Step 202: from the sampled target users, identify the users who have churned and the users who have not churned at the sampling time.

Specifically, churned and non-churned users are identified among the target users according to whether the target behavior was performed within the target duration before the sampling time, where the target behavior includes at least one of a purchase behavior, a browsing behavior and a renewal behavior. For example, if the sampling time is June 30, 2017 and the chosen target duration is 45 days, a churned user is one who did not perform the target behavior in the 45 days before June 30, 2017, and a non-churned user is one who performed the target behavior in the 45 days before June 30, 2017.
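The churn rule in this example can be sketched as follows, using the sampling time of June 30, 2017 and the 45-day target duration from the text. The function name and the choice to exclude the sampling day itself from the window are assumptions made for illustration.

```python
from datetime import date, timedelta

def is_churned(behavior_dates, sampling_date, target_days=45):
    """Return True if no target behavior falls in the window before sampling.

    The window covers the `target_days` days before `sampling_date`;
    the sampling day itself is excluded (an assumption).
    """
    window_start = sampling_date - timedelta(days=target_days)
    return not any(window_start <= d < sampling_date for d in behavior_dates)

sampling_date = date(2017, 6, 30)
# Last target behavior on June 1, 2017: inside the 45-day window.
recent_user_churned = is_churned([date(2017, 6, 1)], sampling_date)
# Last target behavior on May 1, 2017: outside the 45-day window.
inactive_user_churned = is_churned([date(2017, 5, 1)], sampling_date)
```

Changing `target_days` changes which users count as churned, so the same function can be used to compare candidate target durations.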

The target duration may be 30 days, 45 days or 75 days. If the target duration is set to 30 days and a user who has not performed the target behavior within 30 days is directly judged to be a churned customer, misjudgment is likely, because a 30-day interval is short and a user who has not spent anything in 30 days has not necessarily churned. If the target duration is set to 75 days, users who have not performed the target behavior for a long time can be clearly identified, but a 75-day interval is too long and does not help problems to be found early. A target duration of 45 days is preferable: problems can be found early and misjudgment is reduced, so 45 days is taken as the suitable target duration in this embodiment. Those skilled in the art may also set a different target duration according to the actual circumstances of the target users; this embodiment imposes no limitation.

Step 203: generate negative samples from the account information of the users who have churned at the sampling time, and generate positive samples and validation samples from the account information of the users who have not churned at the sampling time.

Step 204: according to the negative samples and the positive samples, train a decision tree model for predicting the churn state of users after the sampling time.

For steps 203 and 204, refer to steps 103 and 104 in the previous embodiment; details are not repeated here.

Step 205: input the validation samples into the trained decision tree model to obtain predicted churn states.

For example, the validation samples obtained at the sampling time of June 30, 2017 are judged to be in the non-churned state, and there are 40 of them. On the prediction date of October 30, 2017, after they are input into the trained decision tree model, 10 users are predicted to be churned and 30 to be non-churned.

Step 206: determine whether the precision-recall rate calculated from the predicted churn states and the actual churn states is greater than the threshold; if so, perform step 207; if not, perform step 208.

Here, the precision-recall rate refers to precision and recall, which are used to evaluate the quality of the results. In this embodiment, precision is the proportion of users predicted as churned who actually churned, and recall is the proportion of actually churned users who were predicted as churned. Both values lie between 0 and 1, and the closer a value is to 1, the higher the precision or recall. As one possible implementation, the precision-recall rate in this embodiment may be the harmonic mean F of precision and recall, F = 2 × precision × recall / (precision + recall), and the training effect of the decision tree model is judged by comparing F with a preset threshold. Specifically, the target users corresponding to the validation samples are input into the decision tree model, and each prediction result obtained is compared with the real attrition status to determine whether the prediction for each validation sample is accurate. For example, suppose there are 40 validation samples. After they are input into the trained decision tree model, 15 users are predicted as churned and 25 as not churned, whereas of the 40 users corresponding to the validation samples, 10 actually churned and 30 did not. The predicted results are compared with the actual results to determine whether each predicted attrition status is correct, precision and recall are calculated from the comparison, and F is then calculated from precision and recall and compared with the preset threshold. If F is greater than the preset threshold, the training effect is good and training of the decision tree model is determined to be complete. If F is not greater than the preset threshold, a further decision is needed: either readjust the target duration and retrain the model, or adjust only the parameters of the decision tree model and retrain the model.
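
The F-value check of step 206 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text gives only the totals (15 predicted churned, 10 actually churned, 40 samples), so the overlap of 9 correctly predicted churners and the threshold of 0.7 are assumptions made up for the example, and the function name `f_value` is hypothetical.

```python
def f_value(predicted, actual):
    """Return (precision, recall, F), where True means 'churned'."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    predicted_pos = sum(1 for p in predicted if p)
    actual_pos = sum(1 for a in actual if a)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# 40 validation samples: the first 15 are predicted as churned.
predicted = [True] * 15 + [False] * 25
# Assumed ground truth: 9 of those 15 really churned, plus 1 churner the model missed.
actual = [True] * 9 + [False] * 6 + [True] * 1 + [False] * 24

precision, recall, f = f_value(predicted, actual)
THRESHOLD = 0.7  # hypothetical preset threshold
training_done = f > THRESHOLD  # step 206: "greater than" leads to step 207
```

Under these assumed figures precision is 9/15 and recall is 9/10, so the comparison with the threshold decides between step 207 and step 208.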

Step 207: determine that training of the decision tree model is complete, and predict user attrition status according to the decision tree model.

Specifically, the target user is input into the trained decision tree model to predict the target user's attrition status. If the predicted attrition status of the target user is churned, a task instructing that the target user be maintained is generated. As one possible implementation, customer service staff, on receiving the maintenance task, analyze why the user may churn and use measures such as coupons to encourage the customer to renew and consume, so as to prevent customer churn. After the task is issued, the actual attrition status of the target user determined during task execution is obtained, and the accuracy of the decision tree model is determined from the actual attrition status and the predicted attrition status of the target user.
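
The task generation and feedback loop described above could look roughly like this. It is a sketch under assumptions: the status labels `"churn"` and `"stay"`, the task dictionary layout, and both function names are hypothetical, and accuracy is computed here simply as the share of fed-back users whose prediction matched.

```python
def build_maintenance_tasks(predictions):
    """Create one maintenance task per user predicted to churn.

    predictions maps user_id -> predicted status ("churn" or "stay").
    """
    return [
        {"user_id": user_id, "action": "maintain"}
        for user_id, status in predictions.items()
        if status == "churn"
    ]


def feedback_accuracy(predictions, feedback):
    """Share of users with feedback whose predicted status matched the actual one.

    feedback maps user_id -> actual status collected while the task was executed.
    Returns None when no feedback is available yet.
    """
    checked = [u for u in feedback if u in predictions]
    if not checked:
        return None
    hits = sum(1 for u in checked if predictions[u] == feedback[u])
    return hits / len(checked)


predictions = {"u1": "churn", "u2": "stay", "u3": "churn"}
tasks = build_maintenance_tasks(predictions)
feedback = {"u1": "churn", "u3": "stay"}  # made-up feedback from customer service
accuracy = feedback_accuracy(predictions, feedback)
```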

Optionally, when predictions are made with the determined decision tree, the prediction effect may also be assessed from three aspects: the customer, customer service, and the operating unit. The indicators assessed on the customer side are changes in customer health, changes in account indicators, and changes in customer feedback. The indicator assessed on the customer-service side is the year-on-year increase in the churn rate per unit time. The indicators assessed for the operating unit are the year-on-year increase in the churn rate per unit time and the churn rate of the control group. Using the assessment results from these three aspects, the model is continually trained and optimized, which improves the accuracy of model prediction and the accuracy of task issuing.

Step 208: adjust the value of the target duration, and return to step 202.

Specifically, if the F value of the precision-recall rate calculated from the predicted attrition status and the actual attrition status is found to be less than the threshold, the model training effect is unsatisfactory and the value of the target duration needs to be adjusted. The method returns to step 202 and re-identifies the churned users and non-churned users at the sampling instant. Negative samples are regenerated from the account information of the re-identified churned users, positive samples and validation samples are regenerated from the account information of the re-identified non-churned users, and the decision tree model is retrained on the regenerated negative and positive samples, thereby optimizing the decision tree model.
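
The control flow of step 208 can be sketched as a loop over candidate target durations. The text does not say how the new duration is chosen, so trying a fixed list of candidates in order is an assumption, as are the function name and the three callables `build_samples`, `train`, and `evaluate`, which stand in for steps 202 to 205.

```python
def retrain_with_adjusted_duration(candidate_durations, build_samples, train,
                                   evaluate, threshold):
    """Try candidate target durations until the validation score reaches the threshold.

    build_samples(duration) -> (negatives, positives, validation)
    train(negatives, positives) -> model
    evaluate(model, validation) -> score (for example the F value)
    Returns (duration, model, score) for the first success, or None.
    """
    for duration in candidate_durations:
        # Re-identify churned and non-churned users and regenerate all samples.
        negatives, positives, validation = build_samples(duration)
        model = train(negatives, positives)
        score = evaluate(model, validation)
        if score >= threshold:  # "not less than the threshold", as in claim 1
            return duration, model, score
    return None
```

A caller would pass its own sample builder and trainer; the loop only decides when to stop adjusting the duration.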

It should be noted that, as another possible implementation, the parameters of the decision tree model may also be adjusted after step 208; alternatively, when step 206 determines that the precision-recall rate is not greater than the threshold, the parameters of the decision tree model are adjusted and the method returns to step 204.

Specifically, if the predicted attrition status of the validation samples is the same as the actual attrition status, the model parameters of the decision tree model are adjusted. If the decision tree model is a gradient-boosting-based xgboost decision tree, the model parameters include at least one of: the loss function to be minimized, the positive/negative sample balance parameter, the iteration weight, the number of iterations, the maximum tree depth, and the metric parameter used to evaluate validation data. The method then returns to step 204 and retrains the parameter-adjusted decision tree model on the determined positive and negative samples, thereby optimizing the decision tree model.
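
For illustration, the parameters listed above can be written as a parameter dictionary of the kind xgboost accepts. The mapping from the wording of this description to the xgboost names `objective`, `scale_pos_weight`, `eta`, `max_depth`, and `eval_metric`, and to the `num_boost_round` training argument, is an assumption, as are all the values and the helper name `adjusted`. The sketch does not import xgboost and only shows how one round of parameter adjustment might be expressed.

```python
# Hypothetical starting parameters; the values are placeholders, not from the text.
base_params = {
    "objective": "binary:logistic",  # loss function to be minimized
    "scale_pos_weight": 1.0,         # balance between positive and negative samples
    "eta": 0.3,                      # iteration weight (shrinkage per boosting round)
    "max_depth": 6,                  # maximum tree depth
    "eval_metric": "auc",            # metric used to evaluate validation data
}
num_boost_round = 100                # number of iterations


def adjusted(params, **overrides):
    """Return a copy of params with some values changed, rejecting unknown names."""
    unknown = set(overrides) - set(params)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**params, **overrides}


# One possible adjustment before returning to step 204: shallower trees, smaller steps.
retry_params = adjusted(base_params, eta=0.1, max_depth=4)
```

Returning a copy keeps the earlier parameter set available, so successive adjustments can be compared against the same baseline.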

It should be noted that, for the explanation and definition of the model parameters, reference may be made to the explanation of step 104 in the previous embodiment, which is not repeated here.

To implement the above embodiments, the present invention further proposes a status prediction device.

Fig. 3 is a schematic structural diagram of a status prediction device provided by an embodiment of the present invention.

As shown in Fig. 3, the device includes a sampling module 31, a sample generation module 32, a training module 33, a verification module 34, and a prediction module 35.

The sampling module 31 is configured to sample target users.

The sample generation module 32 is configured to identify, from the sampled target users, the users who have churned and the users who have not churned at the sampling instant, where a churned user is a user who did not perform the target behavior within the target duration before the sampling instant, and a non-churned user is a user who performed the target behavior within the target duration before the sampling instant. It generates negative samples from the account information of the users who have churned at the sampling instant, and generates positive samples and validation samples from the account information of the users who have not churned at the sampling instant.
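
The labelling rule used by the sample generation module can be sketched as below. The function name, the input layout (each user's most recent target behavior at or before the sampling instant), and the choice to treat the window boundaries as inclusive are assumptions made for the example.

```python
from datetime import datetime, timedelta


def split_by_attrition(last_action, sampling_time, target_duration):
    """Split users into (churned, not_churned) at the sampling instant.

    last_action maps user_id -> datetime of the user's most recent target
    behavior at or before sampling_time, or None if there was none.
    """
    window_start = sampling_time - target_duration
    churned, not_churned = [], []
    for user_id, when in last_action.items():
        if when is not None and window_start <= when <= sampling_time:
            not_churned.append(user_id)   # source of positive and validation samples
        else:
            churned.append(user_id)       # source of negative samples
    return churned, not_churned


sampling_time = datetime(2017, 12, 1)
last_action = {
    "a": datetime(2017, 11, 20),  # active inside a 30-day window
    "b": datetime(2017, 9, 1),    # last active long before the window
    "c": None,                    # never performed the target behavior
}
churned, not_churned = split_by_attrition(last_action, sampling_time, timedelta(days=30))
```

Because only the most recent behavior is inspected, a user whose latest behavior falls outside the window cannot have any behavior inside it.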

The training module 33 is configured to train, according to the negative samples and the positive samples, a decision tree model for predicting user attrition status after the sampling instant.

The verification module 34 is configured to input the validation samples into the trained decision tree model to obtain the predicted attrition status, and to determine that training of the decision tree model is complete if the predicted attrition status of the validation samples is the same as the actual attrition status, where the actual attrition status is determined according to whether the user performed the target behavior within the target duration after the sampling instant.
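
How the actual attrition status of a validation user might be derived can be sketched as follows. The function name, the status strings, and the boundary handling (behavior strictly after the sampling instant and no later than the end of the target duration) are assumptions; the text only states that the status depends on whether the target behavior occurred within the target duration after the sampling instant.

```python
from datetime import datetime, timedelta


def actual_attrition_status(action_times, sampling_time, target_duration):
    """Label a user from the target behaviors observed after the sampling instant."""
    window_end = sampling_time + target_duration
    performed = any(sampling_time < t <= window_end for t in action_times)
    return "not churned" if performed else "churned"


sampling_time = datetime(2017, 12, 1)
window = timedelta(days=30)
status_active = actual_attrition_status([datetime(2017, 12, 10)], sampling_time, window)
status_late = actual_attrition_status([datetime(2018, 2, 1)], sampling_time, window)
status_silent = actual_attrition_status([], sampling_time, window)
```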

The prediction module 35 is configured to predict user attrition status according to the trained decision tree model.

It should be noted that the foregoing explanation of the method embodiments also applies to the device of this embodiment and is not repeated here.

In the status prediction device of the embodiment of the present invention, the sampling module samples target users; the sample generation module generates negative samples, positive samples, and validation samples from the account information of the users identified as churned and not churned at the sampling instant; the training module trains, on the positive and negative samples, a decision tree model for predicting user attrition status after the sampling instant; the verification module inputs the validation samples into the trained decision tree model to obtain the predicted attrition status and, if the precision-recall rate calculated from the predicted attrition status and the actual attrition status is not less than the threshold, determines that training of the decision tree model is complete; and the prediction module predicts user attrition status according to the trained decision tree model. Training the decision tree model on samples generated by sampling, and predicting user attrition status with the trained model, solves the technical problem in the prior art that predicting user attrition status by human experience or hand-written rules consumes considerable manpower, yields inaccurate predictions, and cannot be reused.

Based on the above embodiment, an embodiment of the present invention further provides a possible implementation of the status prediction device. Fig. 4 is a schematic structural diagram of another status prediction device provided by an embodiment of the present invention. On the basis of the previous embodiment, the device further includes a first adjusting module 36, a second adjusting module 37, a user identification module 38, a generation module 39, an acquisition module 40, and a determining module 41.

The first adjusting module 36 is configured to adjust the value of the target duration if the precision-recall rate calculated from the predicted attrition status of the validation samples and the actual attrition status is less than the threshold.

As one possible implementation, the sample generation module 32 may further be configured to re-identify, according to the adjusted target duration, the churned users and non-churned users at the sampling instant, regenerate negative samples from the account information of the re-identified churned users, and regenerate positive samples and validation samples from the account information of the re-identified non-churned users.

The second adjusting module 37 is configured to adjust the model parameters of the decision tree model if the predicted attrition status of the validation samples is the same as the actual attrition status, where, if the decision tree model is a gradient-boosting-based xgboost decision tree, the model parameters include at least one of: the loss function to be minimized, the positive/negative sample balance parameter, the iteration weight, the number of iterations, the maximum tree depth, and the metric parameter used to evaluate validation data.

As one possible implementation, the training module 33 may further be configured to retrain the decision tree model on the regenerated negative samples and the regenerated positive samples.

The user identification module 38 is configured to identify active users from all users, where an active user is a user who has performed the target behavior, and the target behavior includes at least one of purchasing behavior, browsing behavior, and renewal behavior.

The generation module 39 is configured to generate a task instructing that the target user be maintained if the predicted attrition status of the target user is churned.

The acquisition module 40 is configured to obtain, after the task is issued, the actual attrition status of the target user determined during task execution.

The determining module 41 is configured to determine the accuracy of the decision tree model from the actual attrition status and the predicted attrition status of the target user.

As one possible implementation, the training module 33 may further include an extraction unit 331 and a training unit 332.

The extraction unit 331 is configured to perform feature extraction on the negative samples and the positive samples.

The training unit 332 is configured to take the features of the negative samples and the features of the positive samples as the input of the decision tree model, take the churned or non-churned status as the classification output of the decision tree, and carry out the training process on the decision tree model using the xgboost algorithm.

The extraction unit 331 may be specifically configured to perform feature extraction on the negative samples and the positive samples to obtain attribute features and behavioral features. The attribute features include at least one of: account validity status, account opening, and account operating entity. The behavioral features include: total consumption, total clicks, account balance, and the number of days between the most recent consumption and the sampling instant.
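
A feature vector of the kind the extraction unit produces could be assembled as in the sketch below. The field names of the account record, the numeric encoding, and the function name are hypothetical; only one attribute feature (account validity) is encoded, because the text does not say how account opening or the operating entity are represented.

```python
from datetime import date


def extract_features(account, sampling_date):
    """Turn one account record into a numeric feature list for the decision tree model."""
    days_since_last = (sampling_date - account["last_consumption_date"]).days
    return [
        1.0 if account["is_valid"] else 0.0,  # attribute: account validity status
        float(account["total_consumption"]),  # behavior: total consumption
        float(account["total_clicks"]),       # behavior: total clicks
        float(account["balance"]),            # behavior: account balance
        float(days_since_last),               # behavior: days since the last consumption
    ]


account = {
    "is_valid": True,
    "total_consumption": 1200,
    "total_clicks": 340,
    "balance": 55.5,
    "last_consumption_date": date(2017, 11, 20),
}
features = extract_features(account, date(2017, 12, 1))
```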

To implement the above embodiments, the present invention further proposes a computer device including a memory, a processor, and a computer program stored in the memory and runnable on the processor. When the processor executes the program, the trend prediction method described in the foregoing method embodiments is implemented.

To implement the above embodiments, the present invention further proposes a non-transitory computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the trend prediction method described in the foregoing method embodiments is implemented.

To implement the above embodiments, the present invention further proposes a computer program product. When the instructions in the computer program product are executed by a processor, the trend prediction method described in the foregoing method embodiments is implemented.

Fig. 5 shows a block diagram of an exemplary computer device suitable for implementing embodiments of the present application. The computer device 12 shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.

As shown in Fig. 5, the computer device 12 takes the form of a general-purpose computing device. The components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).

The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnection (PCI) bus.

The computer device 12 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 12, including volatile and non-volatile media and removable and non-removable media.

The memory 28 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Fig. 5, commonly referred to as a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a compact disc read-only memory (CD-ROM), a digital versatile disc read-only memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present application.

A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data, and each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in this application.

The computer device 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (such as a network card, a modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. In addition, the computer device 12 may communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown in the figure, the network adapter 20 communicates with the other modules of the computer device 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example implementing the methods mentioned in the foregoing embodiments.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.

In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise.

Any process or method description in a flow chart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.

The logic and/or steps represented in a flow chart or otherwise described herein, for example an ordered list of executable instructions that may be considered to implement logic functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.

Those of ordinary skill in the art will understand that all or some of the steps carried by the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.

Claims

1. A trend prediction method, characterized by comprising the following steps:

sampling target users;

identifying, from the sampled target users, users who have churned and users who have not churned at the sampling instant; wherein a churned user is a user who did not perform the target behavior within the target duration before the sampling instant, and a non-churned user is a user who performed the target behavior within the target duration before the sampling instant;

generating negative samples from the account information of the users who have churned at the sampling instant, and generating positive samples and validation samples from the account information of the users who have not churned at the sampling instant;

training, according to the negative samples and the positive samples, a decision tree model for predicting user attrition status after the sampling instant;

if the precision-recall rate calculated from the predicted attrition status of the validation samples and the actual attrition status is not less than a threshold, determining that training of the decision tree model is complete; wherein the actual attrition status is determined according to whether the user performed the target behavior within the target duration after the sampling instant;

2. The trend prediction method according to claim 1, characterized in that, after the validation samples are input into the trained decision tree model to obtain the predicted attrition status, the method further comprises:

if the precision-recall rate calculated from the predicted attrition status of the validation samples and the actual attrition status is less than the threshold, adjusting the value of the target duration;

re-identifying, according to the adjusted target duration, the churned users and non-churned users at the sampling instant;

regenerating negative samples from the account information of the re-identified churned users, and regenerating positive samples and validation samples from the account information of the re-identified non-churned users;

retraining the decision tree model according to the regenerated negative samples and the regenerated positive samples.

3. The trend prediction method according to claim 1, characterized in that, after the validation samples are input into the trained decision tree model to obtain the predicted attrition status, the method further comprises:

if the predicted attrition status of the validation samples is the same as the actual attrition status, adjusting the model parameters of the decision tree model; wherein, if the decision tree model is a gradient-boosting-based xgboost decision tree, the model parameters include at least one of: the loss function to be minimized, the positive/negative sample balance parameter, the iteration weight, the number of iterations, the maximum tree depth, and the metric parameter used to evaluate validation data;

retraining the decision tree model according to the negative samples and the positive samples.

4. The trend prediction method according to any one of claims 1-3, characterized in that the target users are active users, and before the target users are sampled, the method further comprises:

identifying the active users from all users; wherein an active user is a user who has performed the target behavior, and the target behavior includes at least one of purchasing behavior, browsing behavior, and renewal behavior.

5. The trend prediction method according to any one of claims 1-3, characterized in that training, according to the negative samples and the positive samples, the decision tree model for predicting user attrition status after the sampling instant comprises:

performing feature extraction on the negative samples and the positive samples;

taking the features of the negative samples and the features of the positive samples as the input of the decision tree model, taking the churned or non-churned status as the classification output of the decision tree, and carrying out the training process on the decision tree model using the xgboost algorithm.

6. The trend prediction method according to claim 5, characterized in that performing feature extraction on the negative samples and the positive samples comprises:

performing feature extraction on the negative samples and the positive samples to obtain attribute features and behavioral features; wherein the attribute features include at least one of: account validity status, account opening, and account operating entity; and the behavioral features include: total consumption, total clicks, account balance, and the number of days between the most recent consumption and the sampling instant.

7. The trend prediction method according to any one of claims 1-3, characterized in that, after user attrition status is predicted according to the trained decision tree model, the method further comprises:

if the predicted attrition status of a target user is churned, generating a task instructing that the target user be maintained;

after the task is issued, obtaining the actual attrition status of the target user determined during task execution;

determining the accuracy of the decision tree model from the actual attrition status of the target user and the predicted attrition status of the target user.

8. A status prediction device, characterized by comprising:

a sampling module, configured to sample target users;

a sample generation module, configured to identify, from the sampled target users, users who have churned and users who have not churned at the sampling instant; wherein a churned user is a user who did not perform the target behavior within the target duration before the sampling instant, and a non-churned user is a user who performed the target behavior within the target duration before the sampling instant; and to generate negative samples from the account information of the users who have churned at the sampling instant, and generate positive samples and validation samples from the account information of the users who have not churned at the sampling instant;

a training module, configured to train, according to the negative samples and the positive samples, a decision tree model for predicting user attrition status after the sampling instant;

a verification module, configured to input the validation samples into the trained decision tree model to obtain the predicted attrition status, and, if the predicted attrition status of the validation samples is the same as the actual attrition status, determine that training of the decision tree model is complete; wherein the actual attrition status is determined according to whether the user performed the target behavior within the target duration after the sampling instant;

9. A computer device, characterized by comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein when the processor executes the program, the trend prediction method according to any one of claims 1-7 is implemented.

10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that when the program is executed by a processor, the trend prediction method according to any one of claims 1-7 is implemented.

11. A computer program product, characterized in that when the instructions in the computer program product are executed by a processor, the trend prediction method according to any one of claims 1-7 is performed.