CN107609700A

CN107609700A - Customer value model optimization method based on machine learning

Info

Publication number: CN107609700A
Application number: CN201710807555.0A
Authority: CN
Inventors: 李星龙; 李伟; 汤紫瑜
Original assignee: Uuua Information Technology (suzhou) Co Ltd
Current assignee: Uuua Information Technology (suzhou) Co Ltd
Priority date: 2017-09-08
Filing date: 2017-09-08
Publication date: 2018-01-19

Abstract

The present invention relates to a customer value model optimization method based on machine learning, comprising the following steps. Step 1: extract the customer value model data of N customer entities from different periods by a random sampling method to obtain initial model data samples Si (i = 1, 2, 3, ..., N). Step 2: apply the bagging machine learning method to each initial model data sample Si (i = 1, 2, 3, ..., N) to train N independent individual weak learners Hi (i = 1, 2, 3, ..., N). Step 3: combine the individual weak learners Hi (i = 1, 2, 3, ..., N) into one strong learner H using a stacking combination strategy. Step 4: take the strong learner H as the optimal model rule and input the existing customer value model data samples into the strong learner H; the result produced by the strong learner H is the optimal result model.

Description

Customer value model optimization method based on machine learning

Technical field

The present invention relates to a method for processing transaction data, and more particularly to a customer value model optimization method based on machine learning.

Background art

At present, traditional model optimization is verified by comparative experiments. For target identification models, comparison data are selected according to the application scenario of the model to be optimized; the data contain a portion of target data along with other interfering data. The test data are fed into the model, and the number of target data items identified in the model output is checked to judge the model's effectiveness. Model effectiveness is judged mainly by two metrics on the target data, recall and precision:

Recall is the number of target data samples contained in the model's results, as a percentage of the target data samples in the test data.

Precision is the number of target data samples contained in the model's results, as a percentage of all samples identified by the model.
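The two metrics defined above can be illustrated with a minimal Python sketch. The function name `recall_and_precision` and the sample identifiers are hypothetical and chosen only for illustration; the patent does not prescribe any implementation.

```python
def recall_and_precision(identified, targets):
    """Compute recall and precision for a target identification model.

    identified: sample IDs that the model reported as targets.
    targets:    sample IDs that are true targets in the test data.
    """
    identified = set(identified)
    targets = set(targets)
    hits = identified & targets  # target samples contained in the model's results
    recall = len(hits) / len(targets) if targets else 0.0
    precision = len(hits) / len(identified) if identified else 0.0
    return recall, precision


# Hypothetical example: the model flags samples 1-4; the true targets are 2, 3 and 5.
recall, precision = recall_and_precision(identified=[1, 2, 3, 4], targets=[2, 3, 5])
print(f"recall={recall:.3f}, precision={precision:.3f}")
```

Here two of the three true targets are found (recall 2/3), and two of the four flagged samples are true targets (precision 1/2).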

For metric prediction models, historical data are likewise selected and fed into the model, and the model's results are compared with the actual data to calculate the error range. If the error range meets the model's accuracy design requirement, the model needs no optimization; if the error range exceeds the accuracy requirement, the model must be optimized.
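The check described above can be sketched as follows. The patent does not say how the error range is measured, so this sketch assumes the worst relative error over the historical records; the function name `needs_optimization` and the 10% accuracy requirement are likewise hypothetical.

```python
def needs_optimization(predicted, actual, max_relative_error):
    """Return (flag, worst_error) for a metric prediction model.

    Assumption: the "error range" is taken as the largest relative error
    between predictions and actual values; the patent does not define it.
    """
    errors = [abs(p - a) / abs(a) for p, a in zip(predicted, actual) if a != 0]
    worst = max(errors) if errors else 0.0
    return worst > max_relative_error, worst


# Hypothetical historical data: three predictions against an actual value of 100.
flag, worst = needs_optimization(
    predicted=[102.0, 95.0, 130.0],
    actual=[100.0, 100.0, 100.0],
    max_relative_error=0.10,  # assumed accuracy design requirement
)
print(flag, round(worst, 2))
```

In this example the third prediction is off by 30%, which exceeds the assumed 10% requirement, so the model would be flagged for optimization.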

At present, the optimization process for an industry application model is essentially the same as the process of building a new model: the model input data must be subjected to association analysis again, and new data fields are introduced to replace the old data. As for the model algorithm, a better algorithm is selected to replace the original one, based on the overall state of fundamental algorithm research at the time of optimization.

From the above description of how current business models are optimized, it can be seen that the existing approach to model optimization is rather traditional: it is labor-intensive, its time cost is high, and its efficiency is low. Moreover, existing industry application models can only be optimized under experimental conditions and cannot be optimized automatically in real time in a live operating environment, which delays commercial use. If the model is a core mechanism of an enterprise, the optimization process can also cause the enterprise considerable losses. As a result, most enterprises are unwilling to spend so much on model optimization and continue to use old models, which in turn impairs the models' actual effectiveness.

Summary of the invention

To solve the above technical problems, the object of the present invention is to provide a customer value model optimization method based on machine learning that reduces labor and time costs, improves the efficiency of data optimization, preserves the effectiveness of the model in application, and improves the benefit obtained from its use.

The customer value model optimization method based on machine learning of the present invention is characterized in that the method comprises the following steps:

Step 1: Extract the customer value model data of N customer entities from different periods by a random sampling method to obtain initial model data samples Si (i = 1, 2, 3, ..., N);

Step 2: Apply the bagging machine learning method to each initial model data sample Si (i = 1, 2, 3, ..., N) to train N independent individual weak learners Hi (i = 1, 2, 3, ..., N);

Step 3: Combine the individual weak learners Hi (i = 1, 2, 3, ..., N) of step 2 into one strong learner H using a stacking combination strategy;

Step 4: Take the strong learner H obtained in step 3 as the optimal model rule and input the existing customer value model data samples into the strong learner H; the result produced by the strong learner H is the optimal result model.

Further, the random sampling method in step 1 is bootstrap sampling: for an original training set of N samples, one sample is drawn at random and placed in the sampling set, and the sample is then returned to the training set; this is repeated N times until a sampling set of N samples is obtained.
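The bootstrap procedure described above can be sketched in a few lines of Python. The function name `bootstrap_sample` and the toy data set are hypothetical; a fixed random seed is assumed only to make the example reproducible.

```python
import random


def bootstrap_sample(dataset, rng):
    """Draw len(dataset) samples with replacement from dataset."""
    n = len(dataset)
    # Each draw picks one record at random and "puts it back",
    # so the same record may be drawn again on a later draw.
    return [dataset[rng.randrange(n)] for _ in range(n)]


rng = random.Random(0)       # assumed seed, for reproducibility only
original = list(range(10))   # stand-in for N customer value records
sampling_set = bootstrap_sample(original, rng)
print(sampling_set)
```

Because the draws are made with replacement, some records appear more than once in the sampling set and others not at all; for large N, roughly 63.2% of the distinct original records are expected to appear.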

Further, the stacking combination strategy in step 3 comprises the following steps:

First, 45%-55% of the data samples are randomly selected from the customer value model data set as the training set, and 20%-30% of the data samples are randomly selected from the customer value model data set as the test set;

Next, a secondary learner is trained; during its training, the learning results of the weak learners Hi (i = 1, 2, 3, ..., N) serve as the input of the secondary learner, and the results of the training set serve as its output;

Finally, the primary learners predict the test set once to obtain the input samples of the secondary learner, and the secondary learner then predicts the test set once to obtain the predicted samples. Meanwhile, the data association matching relationship between the input samples and the predicted samples is trained continuously to obtain the best model input and the process parameter value ranges that yield the optimal output result, thereby obtaining the strong learner H.
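The stacking steps above can be sketched as follows. Everything concrete here is an assumption made for illustration: the patent does not name the weak learner or the secondary learner, so decision stumps and a small logistic regression are swapped in, and the two-feature synthetic data set, the five learners, and all function names are hypothetical.

```python
import math
import random


def train_stump(sample):
    """Weak learner (assumed): single-feature threshold rule with fewest errors."""
    best = None
    for f in range(len(sample[0][0])):
        for x, _ in sample:
            for sign in (1, -1):
                errors = sum(
                    1 for xi, yi in sample
                    if (1 if sign * (xi[f] - x[f]) >= 0 else 0) != yi
                )
                if best is None or errors < best[0]:
                    best = (errors, f, x[f], sign)
    _, f, t, sign = best
    return lambda features: 1 if sign * (features[f] - t) >= 0 else 0


def train_secondary(meta_inputs, labels, epochs=200, lr=0.1):
    """Secondary learner (assumed): logistic regression on weak-learner outputs."""
    w, b = [0.0] * len(meta_inputs[0]), 0.0
    for _ in range(epochs):
        for z, y in zip(meta_inputs, labels):
            s = max(-30.0, min(30.0, sum(wi * zi for wi, zi in zip(w, z)) + b))
            g = 1.0 / (1.0 + math.exp(-s)) - y  # log-loss gradient w.r.t. s
            w = [wi - lr * g * zi for wi, zi in zip(w, z)]
            b -= lr * g
    return lambda z: 1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else 0


rng = random.Random(42)
# Synthetic stand-in for customer value records: two features, label 1 = high value.
data = []
for _ in range(200):
    x = (rng.random(), rng.random())
    data.append((x, 1 if x[0] + x[1] > 1.0 else 0))

train_set, test_set = data[:100], data[100:150]  # 50% and 25% of the data set

# Primary (weak) learners, each trained on a bootstrap sample of the training set.
weak_learners = []
for _ in range(5):
    boot = [train_set[rng.randrange(len(train_set))] for _ in range(len(train_set))]
    weak_learners.append(train_stump(boot))

# Weak-learner outputs are the secondary learner's input; training labels its output.
meta_train = [[h(x) for h in weak_learners] for x, _ in train_set]
secondary = train_secondary(meta_train, [y for _, y in train_set])


def strong_learner(x):
    """Strong learner H: weak learners followed by the secondary learner."""
    return secondary([h(x) for h in weak_learners])


predictions = [strong_learner(x) for x, _ in test_set]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test_set)) / len(test_set)
print(f"test accuracy: {accuracy:.2f}")
```

The sketch covers only the train-then-predict flow; the continuous retraining of the association matching relationship described in the text is not modeled.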

Further, the data association matching relationship comprises the association matching relationship among the customer value model input data, the process parameters, and the output results. The process parameters are the weights of the indicators in the customer value model data or the value ranges of the indicators used to divide customer categories, and the output results are customer value labels or customer segmentation rules.

Further, the customer value model data comprise the data fields in the indicator system, the indicator weights, the model algorithm, and the model results.
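One possible way to represent such a record is sketched below. The class name, the recency/frequency/monetary indicators, the weighted-sum algorithm, and the tier thresholds are all hypothetical; the patent lists only the kinds of content (data fields, indicator weights, model algorithm, model result) and the two kinds of process parameter.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class CustomerValueModelData:
    """Hypothetical container for one customer value model record."""
    data_fields: Dict[str, float]             # indicator name -> normalized input value
    indicator_weights: Dict[str, float]       # process parameter: weight per indicator
    tier_thresholds: List[Tuple[float, str]]  # process parameter: ascending (min score, label)
    algorithm: str = "weighted_sum"           # assumed model algorithm
    result: Optional[str] = None              # output: customer value label

    def run(self) -> str:
        """Score the customer and assign the highest tier whose minimum is reached."""
        score = sum(self.indicator_weights[k] * v for k, v in self.data_fields.items())
        label = self.tier_thresholds[0][1]
        for min_score, name in self.tier_thresholds:
            if score >= min_score:
                label = name
        self.result = label
        return label


record = CustomerValueModelData(
    data_fields={"recency": 0.8, "frequency": 0.5, "monetary": 0.9},
    indicator_weights={"recency": 0.2, "frequency": 0.3, "monetary": 0.5},
    tier_thresholds=[(0.0, "low"), (0.4, "medium"), (0.7, "high")],
)
label = record.run()
print(label)
```

With these assumed values the weighted score is about 0.76, which falls in the "high" tier. Keeping the weights and thresholds as explicit fields makes them the parameters an optimizer would adjust.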

Further, 50% of the data samples are randomly selected from the customer value model data set as the training set, and 25% of the data samples are randomly selected from the customer value model data set as the test set.
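A minimal sketch of this split follows. It assumes that the training set and the test set are disjoint, which the text does not state explicitly, and it leaves the remaining records unused because the text does not say what they are for. The function name `split_dataset` and the seed are hypothetical.

```python
import random


def split_dataset(dataset, train_frac=0.50, test_frac=0.25, seed=0):
    """Randomly select a training set and a disjoint test set (assumed disjoint)."""
    # Fractions are limited to the 45%-55% and 20%-30% ranges given in the text.
    if not (0.45 <= train_frac <= 0.55 and 0.20 <= test_frac <= 0.30):
        raise ValueError("fractions outside the ranges given in the text")
    shuffled = list(dataset)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    return shuffled[:n_train], shuffled[n_train:n_train + n_test]


records = list(range(100))  # stand-in for customer value model records
train_set, test_set = split_dataset(records)
print(len(train_set), len(test_set))
```

For 100 records this yields 50 training records and 25 test records.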

With the above scheme, the present invention has at least the following advantages. As users continue to use the model, and by combining the differentiated data mining models that different users adopt for the same industry application scenario, the present invention gives the industry application model the ability to learn automatically and optimize in real time. That is, from the moment model construction is complete and the model is put into practical use, it continuously learns and optimizes itself, including both the model input data and the model algorithm, so that the model remains in its optimal state at all times. This avoids the labor, time, and financial losses of conventional model optimization, while preserving the model's effectiveness in application and bringing substantial benefits to customers.

The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and practiced according to the content of the specification, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawing.

Brief description of the drawings

Fig. 1 is the workflow diagram of the present invention.

Embodiment

The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawing and examples. The following examples illustrate the present invention but do not limit its scope.

Referring to Fig. 1, a customer value model optimization method based on machine learning according to a preferred embodiment of the present invention comprises steps 1 to 4 as set out in the summary above.

As a further improvement of the present invention, the random sampling method in step 1 is bootstrap sampling: for an original training set of N samples, one sample is drawn at random and placed in the sampling set, and the sample is then returned to the training set; this is repeated N times until a sampling set of N samples is obtained.

As a further improvement of the present invention, the stacking combination strategy in step 3 comprises the steps set out in the summary above.

As a further improvement of the present invention, the data association matching relationship comprises the association matching relationship among the customer value model input data, the process parameters, and the output results. The process parameters are the weights of the indicators in the customer value model data or the value ranges of the indicators used to divide customer categories, and the output results are customer value labels or customer segmentation rules.

As a further improvement of the present invention, the customer value model data comprise the data fields in the indicator system, the indicator weights, the model algorithm, and the model results.

As a further improvement of the present invention, 50% of the data samples are randomly selected from the customer value model data set as the training set, and 25% of the data samples are randomly selected from the customer value model data set as the test set.

The above are only preferred embodiments of the present invention and are not intended to limit it. It should be noted that those of ordinary skill in the art can make various improvements and modifications without departing from the technical principles of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims

1. A customer value model optimization method based on machine learning, characterized by comprising the following steps:

Step 1: Extract the customer value model data of N customer entities from different periods by a random sampling method to obtain initial model data samples Si (i = 1, 2, 3, ..., N);

Step 2: Apply the bagging machine learning method to each initial model data sample Si (i = 1, 2, 3, ..., N) to train N independent individual weak learners Hi (i = 1, 2, 3, ..., N);

Step 3: Combine the individual weak learners Hi (i = 1, 2, 3, ..., N) of step 2 into one strong learner H using a stacking combination strategy;

Step 4: Take the strong learner H obtained in step 3 as the optimal model rule and input the existing customer value model data samples into the strong learner H; the result produced by the strong learner H is the optimal result model.

2. The customer value model optimization method based on the ensemble learning Bagging algorithm according to claim 1, characterized in that: the random sampling method in step 1 is bootstrap sampling, that is, for an original training set of N samples, one sample is drawn at random and placed in the sampling set, and the sample is then returned to the training set; this is repeated N times until a sampling set of N samples is obtained.

3. The customer value model optimization method based on the ensemble learning Bagging algorithm according to claim 1, characterized in that: the stacking combination strategy in step 3 comprises the following steps:

First, 45%-55% of the data samples are randomly selected from the customer value model data set as the training set, and 20%-30% of the data samples are randomly selected from the customer value model data set as the test set;

Next, a secondary learner is trained; during its training, the learning results of the weak learners Hi (i = 1, 2, 3, ..., N) serve as the input of the secondary learner, and the results of the training set serve as its output;

Finally, the primary learners predict the test set once to obtain the input samples of the secondary learner, and the secondary learner then predicts the test set once to obtain the predicted samples; meanwhile, the data association matching relationship between the input samples and the predicted samples is trained continuously to obtain the best model input and the process parameter value ranges that yield the optimal output result, thereby obtaining the strong learner H.

4. The customer value model optimization method based on the ensemble learning Bagging algorithm according to claim 3, characterized in that: the data association matching relationship comprises the association matching relationship among the customer value model input data, the process parameters, and the output results; the process parameters are the weights of the indicators in the customer value model data or the value ranges of the indicators used to divide customer categories; and the output results are customer value labels or customer segmentation rules.

5. The customer value model optimization method based on the ensemble learning Bagging algorithm according to claim 1, characterized in that: the customer value model data comprise the data fields in the indicator system, the indicator weights, the model algorithm, and the model results.

6. The customer value model optimization method based on the ensemble learning Bagging algorithm according to claim 1, characterized in that: 50% of the data samples are randomly selected from the customer value model data set as the training set, and 25% of the data samples are randomly selected from the customer value model data set as the test set.