CN107730286A - A kind of target customer's screening technique and device - Google Patents
- Publication number: CN107730286A
- Application number: CN201610652832.0A
- Authority
- CN
- China
- Prior art keywords
- sample
- screening
- probability
- classifier
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0202—Market predictions or forecasting for commercial activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0207—Discounts or incentives, e.g. coupons or rebates
- G06Q30/0224—Discounts or incentives, e.g. coupons or rebates based on user history
Abstract
The invention provides a target customer screening method and device. The method includes: screening, from the customer samples to be classified, the current target customers to whom preset information is to be pushed; obtaining the push feedback results after the preset information has been pushed to these target customers; and correcting the screening model used to select the current target customers according to the push feedback results, so that the next round of target customers is screened with the corrected model. By dynamically adjusting the data mining model with marketing feedback results, the scheme improves the accuracy of marketing customer screening, substantially raises the marketing hit rate and precision, improves the marketing effect, considerably reduces the marketing cost, and increases the marketing input-output ratio and marketing revenue.
Description
Technical Field
The invention relates to the technical field of data warehouses, and in particular to a target customer screening method and device.
Background
As market competition intensifies, how to improve the selection accuracy of the target customer group, raise the marketing success rate, and reduce the marketing cost is a constant concern for telecommunications operators and similar enterprises. At present, a data mining model is generally established for marketing activities in a specific field, and customer groups are screened by matching features against user preferences. The data mining model is typically built on analysis of historical data: screening rules for the corresponding user groups are found, and marketing customer groups are then obtained by applying those rules. The latest marketing feedback results are not used in the construction process.
The prior art mainly has the following two defects:
1. hysteresis in data mining model adjustment.
The rules found by data mining often depend on the output indexes, the completeness of the dimensions, and the experience of the data mining team. Once model training has been verified, the model result is fixed, and the target user group is screened out by the corresponding rules. The model's input indexes and their weights are fixed and cannot adjust automatically as the business changes. Only when model performance has degraded to a certain degree do marketing personnel take notice, after which a large amount of manpower, material resources, and time is invested in a second round of optimization of the data mining model.
2. Marketing feedback cannot be fully utilized.
Traditional marketing feedback information is used only to evaluate the effect of marketing activities or the quality of models; the actual feedback information is not fully utilized, and a true closed marketing loop is not realized.
As the market environment changes, user behavior changes with it. The original static model rules or customer labels become increasingly unable to adapt to changes in the market and in customer behavior, so customer group screening becomes inaccurate and the marketing effect grows steadily worse.
Disclosure of Invention
The invention aims to provide a method and a device for screening target customers, which are used for solving the problems of inaccurate screening of marketing customers and poor marketing effect caused by the fact that the existing static data mining model cannot be dynamically adjusted in combination with marketing feedback results.
In order to solve the above technical problem, an embodiment of the present invention provides a target customer screening method, including:
screening current target customers for pushing preset information in a customer sample to be classified;
obtaining a push feedback result after the preset information is pushed to the current target client;
and correcting the screening model for screening the current target client according to the push feedback result so as to screen the next target client according to the corrected screening model.
Further, the step of screening out the target client for pushing the preset information in the client sample to be classified includes:
obtaining a screening model;
respectively calculating the probability value of each customer sample in the customer samples to be classified in each category according to the screening model;
and selecting the current target client for pushing preset information from the client samples to be classified according to the probability value of each client sample.
Further, when the screening model is a naive bayes classifier, the step of obtaining the screening model comprises:
acquiring training data and testing data for constructing a naive Bayes classifier;
constructing an initial classifier by using the training data;
and selecting the classifier by using the test data to obtain a naive Bayes classifier.
Further, the step of constructing an initial classifier using the training data includes:
obtaining the prior probability and the conditional probability of an initial classifier by using training data;
and obtaining the class of each sample in the training data and the test data according to the prior probability and the conditional probability, calculating a training error and a test error, and selecting a classifier with the minimum test error as an initial classifier.
Further, the step of obtaining the prior probability and the conditional probability of the initial classifier by using the training data includes:
using the formula

P(Y = c_k) = \frac{1}{N} \sum_{i=1}^{N} I(y_i = c_k)

to calculate the prior probability of the training data on each category;
using the formula

P(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl}, y_i = c_k)}{\sum_{i=1}^{N} I(y_i = c_k)}

to calculate the conditional probability of the training data;
wherein the training data is T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}; N is the total number of samples x in the training data; P(Y = c_k) is the prior probability on category Y; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; P(X^{(j)} = a_{jl} \mid Y = c_k) is the conditional probability of the training data; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j; k = 1, 2, \ldots, K.
Further, the step of obtaining the class to which each sample in the training data and the test data belongs according to the prior probability and the conditional probability, calculating the training error and the test error, and selecting the classifier with the minimum test error as the initial classifier includes:
using the formula

y = \arg\max_{c_k} P(Y = c_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)

taking the category corresponding to the maximum value obtained by each sample in the training data over the preset categories as the category of that sample; wherein
y is the sample class with the maximum value; P(Y = c_k) is the prior probability on category Y; Y denotes the sample category; c_k is a sample class value; P(X^{(j)} = x^{(j)} \mid Y = c_k) is the conditional probability; x^{(j)} is the j-th feature of the sample instance x, where x = (x^{(1)}, x^{(2)}, \ldots, x^{(n)})^T.
Further, the step of selecting the classifier by using the test data to obtain a naive bayes classifier comprises:
adjusting, multiple times, the target-event concentration (the proportion of positive samples) in the training data and the parameter of the Bayesian estimation; calculating an evaluation index value on the test data after each adjustment; and taking the target-event concentration and Bayesian estimation parameter that correspond to the largest evaluation index value, thereby obtaining the naive Bayes classifier;
wherein the prior probability of the naive Bayes classifier is

P_\lambda(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k) + \lambda}{N + K\lambda}

and the conditional probability is

P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl}, y_i = c_k) + \lambda}{\sum_{i=1}^{N} I(y_i = c_k) + S_j \lambda}

wherein P_\lambda(Y = c_k) is the prior probability of the naive Bayes classifier; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; \lambda is the parameter of the Bayesian estimation; P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) is the conditional probability of the naive Bayes classifier; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j, where S_j is the number of values the j-th feature may take; k = 1, 2, \ldots, K.
Further, the evaluation index value is calculated using the formula

F1 = \frac{2PR}{P + R}

wherein F1 is the evaluation index value; P is the precision of the initial classifier on the test data, P = (number of samples correctly predicted as positive) / (number of samples predicted as positive); R is the recall of the initial classifier on the test data, R = (number of samples correctly predicted as positive) / (number of samples that are actually positive).
Further, the step of selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample comprises:
calculating the probability values of each class attribution for the client samples to be classified; selecting the samples whose target-class probability value is greater than their non-target-class probability value; arranging them in descending order of probability value; and taking a preset number of the top-ranked client samples as the current target clients.
Further, the step of modifying the screening model for screening the current target client according to the push feedback result includes:
comparing the push feedback result with the category label of the current target client, and acquiring a first sample with an unsuccessful push feedback result from the current target client;
and adjusting a screening model according to the first sample.
Further, when the screening model is a naive bayes classifier, the step of adjusting the screening model according to the first sample comprises:
according to the formula

P_1(Y = c_k) = \frac{\sum_{i=1}^{N + N_1} I(y_i = c_k)}{N + N_1}

recalculating the prior probability of the naive Bayes classifier;
according to the formula

P_1(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N + N_1} I(x_i^{(j)} = a_{jl}, y_i = c_k)}{\sum_{i=1}^{N + N_1} I(y_i = c_k)}

recalculating the conditional probability of the naive Bayes classifier;
wherein P_1(Y = c_k) is the recalculated prior probability of the naive Bayes classifier; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; N is the total number of samples x in the training data; P_1(X^{(j)} = a_{jl} \mid Y = c_k) is the recalculated conditional probability; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j; k = 1, 2, \ldots, K; N_1 is the number of first samples.
An embodiment of the present invention provides a target customer screening apparatus, including:
the screening module is used for screening, from the client samples to be classified, the current target clients to whom preset information is to be pushed;
the feedback result acquisition module is used for acquiring a push feedback result after the preset information is pushed to the current target client;
and the correction module is used for correcting the screening model for screening the current target client according to the push feedback result so as to screen the next target client according to the corrected screening model.
Further, the screening module includes:
the model obtaining submodule is used for obtaining a screening model;
the probability calculation submodule is used for respectively calculating the probability value of each client sample in the client samples to be classified on each category according to the screening model;
and the selection submodule is used for selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample.
Further, the correction module comprises:
the feedback result screening submodule is used for comparing the push feedback results with the category labels of the current target clients, and acquiring, from the current target clients, the first samples whose push feedback results indicate unsuccessful pushes;
and the adjusting submodule is used for adjusting the screening model according to the first sample.
The invention has the beneficial effects that:
according to the scheme, the data mining model is dynamically adjusted by utilizing the marketing feedback result, the accuracy of marketing customer screening is improved, the hit rate and accuracy of marketing are greatly improved, the marketing effect is improved, the marketing cost is greatly reduced, and the marketing input-output ratio and the marketing income are improved.
Drawings
Fig. 1 is a schematic flow chart illustrating a target customer screening method according to a first embodiment of the present invention.
FIG. 2 is a flowchart illustrating an implementation of step 100;
FIG. 3 is a flowchart illustrating an implementation of step 110;
FIG. 4 is a flowchart illustrating an implementation of step 112;
FIG. 5 is a flowchart illustrating an implementation of step 300;
FIG. 6 is a schematic structural diagram of an incremental learning model according to a first embodiment of the present invention;
FIG. 7 is a diagram of a naive Bayes incremental learning-based system architecture according to a first embodiment of the present invention;
fig. 8 is a schematic structural diagram of a target customer screening apparatus according to a second embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The invention provides a target customer screening method and device aiming at the problems that the existing static data mining model cannot be dynamically adjusted by combining with a marketing feedback result, so that the screening of marketing customers is inaccurate and the marketing effect is poor.
Example one
As shown in fig. 1, the target customer screening method according to the embodiment of the present invention includes:
step 100, screening current target customers for pushing preset information in a sample of the customers to be classified;
it should be noted that the preset information generally refers to marketing information, for example, information pushing a 4G handset upgrade to the user, information recommending subscription to a certain functional service, and the like.
200, acquiring a push feedback result after preset information is pushed to the current target client;
after receiving the preset information, the user makes a selection based on its content, either accepting or rejecting it. The push feedback result is the marketing feedback, which generally includes successful marketing (the user accepts the pushed service) and unsuccessful marketing (the user rejects the pushed service).
And 300, correcting the screening model for screening the current target client according to the push feedback result, and screening the next target client according to the corrected screening model.
According to the embodiment of the invention, the data mining model is dynamically adjusted by using the marketing feedback result, so that the screening accuracy of marketing customers is improved, the hit rate and accuracy of marketing are greatly improved, the marketing effect is improved, the marketing cost is greatly reduced, and the marketing input-output ratio and the marketing profit are improved.
Optionally, as shown in fig. 2, when implemented specifically, step 100 includes:
step 110, obtaining a screening model;
it should be noted that, in this embodiment, a naive bayes classifier is adopted as a screening model of the client sample.
Step 120, respectively calculating the probability value of each client sample in the client samples to be classified on each category according to the screening model;
and step 130, selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample.
Step 100 is mainly implemented by screening out target customers (i.e., marketing customers) with a naive Bayes classifier; usually, the customer samples given the largest probability values by the classifier are taken as marketing customers.
Further, as shown in fig. 3, the step 110, when implemented specifically, includes:
step 111, acquiring training data and test data for constructing a naive Bayes classifier;
step 112, constructing an initial classifier by using the training data;
and 113, selecting the classifier by using the test data to obtain a naive Bayes classifier.
It should be noted that, when constructing the naive Bayes classifier, an initial classifier is generally constructed using part of the client samples as training data; the initial classifier is then tested and optimized with the test data to obtain the final naive Bayes classifier.
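The train/test partition described here can be sketched as follows; the 70/30 split ratio and the `split_samples` helper are illustrative assumptions, not part of the patent.

```python
import random

def split_samples(samples, test_fraction=0.3, seed=7):
    """Shuffle the client samples and split them into a training set
    and a test set according to test_fraction."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]

# ten dummy client samples, split 70/30
train, test = split_samples(range(10), test_fraction=0.3)
```

A fixed seed keeps the split reproducible between the construction and selection steps.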
Specifically, as shown in fig. 4, step 112, when implemented, comprises:
step 1121, obtaining the prior probability and the conditional probability of the initial classifier by using the training data;
step 1122, obtaining the class of each sample in the training data and the test data according to the prior probability and the conditional probability, calculating the training error and the test error, and selecting the classifier with the minimum test error as the initial classifier.
In general, the specific implementation manner of step 1121 is as follows:
using the formula

P(Y = c_k) = \frac{1}{N} \sum_{i=1}^{N} I(y_i = c_k)

to calculate the prior probability of the training data on each category;
using the formula

P(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl}, y_i = c_k)}{\sum_{i=1}^{N} I(y_i = c_k)}

to calculate the conditional probability of the training data;
wherein the training data is T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}; N is the total number of samples x in the training data; P(Y = c_k) is the prior probability on category Y; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; P(X^{(j)} = a_{jl} \mid Y = c_k) is the conditional probability of the training data; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j; k = 1, 2, \ldots, K, where K is the number of sample classes.
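The count-based estimation of the prior and conditional probabilities above can be sketched in Python; the toy feature values and the `estimate_probabilities` name are hypothetical.

```python
from collections import Counter, defaultdict

def estimate_probabilities(samples, labels):
    """Estimate the prior P(Y=c_k) and the conditional P(X^(j)=a_jl | Y=c_k)
    by counting, mirroring the two formulas in step 1121."""
    n = len(labels)
    class_counts = Counter(labels)                    # sum of I(y_i = c_k)
    prior = {c: class_counts[c] / n for c in class_counts}
    cond = defaultdict(dict)                          # (class, feature j) -> {value: prob}
    for j in range(len(samples[0])):
        pair_counts = Counter((labels[i], samples[i][j]) for i in range(n))
        for (c, v), cnt in pair_counts.items():
            cond[(c, j)][v] = cnt / class_counts[c]
    return prior, cond

# toy customer samples: two categorical features, binary class
X = [("young", "low"), ("young", "high"), ("old", "high"), ("old", "low")]
y = ["no", "yes", "yes", "no"]
prior, cond = estimate_probabilities(X, y)
```

Both counts come from a single pass over the training data, which is what makes the later incremental update cheap.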
In general, the specific implementation of step 1122 is:
using the formula

y = \arg\max_{c_k} P(Y = c_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)

taking the category corresponding to the maximum value obtained by each sample in the training data over the preset categories as the category of that sample; wherein
y is the sample class with the maximum value; P(Y = c_k) is the prior probability on category Y; Y denotes the sample category; c_k is a sample class value; P(X^{(j)} = x^{(j)} \mid Y = c_k) is the conditional probability; x^{(j)} is the j-th feature of the sample instance x, where x = (x^{(1)}, x^{(2)}, \ldots, x^{(n)})^T.
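The arg-max classification rule above can be sketched as follows, using assumed toy probabilities; working in log space and flooring unseen feature values at a small `eps` are implementation conveniences, not part of the patent.

```python
import math

def classify(x, prior, cond, eps=1e-9):
    """Return the class c_k maximising P(Y=c_k) * prod_j P(X^(j)=x^(j)|Y=c_k).
    Log space avoids floating-point underflow for many features."""
    best_class, best_score = None, -math.inf
    for c, p in prior.items():
        score = math.log(p)
        for j, v in enumerate(x):
            score += math.log(cond.get((c, j), {}).get(v, eps))
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# hypothetical probabilities in the same (class, feature-index) layout
prior = {"yes": 0.5, "no": 0.5}
cond = {("yes", 0): {"young": 0.5, "old": 0.5}, ("yes", 1): {"high": 1.0},
        ("no", 0): {"young": 0.5, "old": 0.5}, ("no", 1): {"low": 1.0}}
label = classify(("old", "high"), prior, cond)
```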
Specifically, step 113 is implemented as follows:
adjusting, multiple times, the target-event concentration (the proportion of positive samples) in the training data and the parameter of the Bayesian estimation; calculating an evaluation index value on the test data after each adjustment; and taking the target-event concentration and Bayesian estimation parameter that correspond to the largest evaluation index value, thereby obtaining the naive Bayes classifier;
wherein the prior probability of the naive Bayes classifier is

P_\lambda(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k) + \lambda}{N + K\lambda}

and the conditional probability is

P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl}, y_i = c_k) + \lambda}{\sum_{i=1}^{N} I(y_i = c_k) + S_j \lambda}

wherein P_\lambda(Y = c_k) is the prior probability of the naive Bayes classifier; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; \lambda \ge 0 is the parameter of the Bayesian estimation; P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) is the conditional probability of the naive Bayes classifier; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j, where S_j is the number of values the j-th feature may take; k = 1, 2, \ldots, K.
It should be noted that when a probability value estimated by maximum likelihood is 0, it affects the computation of the posterior probability and may bias the classification. To solve this problem, the conditional and prior probabilities are computed using Bayesian estimation with the parameter \lambda.
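The smoothed prior above can be sketched in a few lines, showing that with \lambda > 0 no class probability is ever zero even for a class unseen in the data; the example labels are hypothetical.

```python
from collections import Counter

def smoothed_prior(labels, classes, lam=1.0):
    """P_lambda(Y=c_k) = (count(c_k) + lambda) / (N + K*lambda).
    With lam > 0 every class gets strictly positive probability."""
    n, k = len(labels), len(classes)
    counts = Counter(labels)
    return {c: (counts[c] + lam) / (n + k * lam) for c in classes}

# "maybe" never occurs, yet receives non-zero probability after smoothing
p = smoothed_prior(["yes", "yes", "no"], ["yes", "no", "maybe"], lam=1.0)
```

The same `+ lambda` numerator and `+ S_j * lambda` denominator pattern applies to the conditional probabilities.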
The above evaluation index value is calculated using the formula

F1 = \frac{2PR}{P + R}

wherein F1 is the evaluation index value; P is the precision of the initial classifier on the test data, P = (number of samples correctly predicted as positive) / (number of samples predicted as positive); R is the recall of the initial classifier on the test data, R = (number of samples correctly predicted as positive) / (number of samples that are actually positive).
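The F1 computation can be sketched directly from the confusion-matrix counts; the names `tp`, `fp`, `fn` (true positives, false positives, false negatives) are the usual terminology, assumed here rather than taken from the patent.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*P*R/(P+R), with P = tp/(tp+fp) and R = tp/(tp+fn)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 8 correct positives, 2 false alarms, 2 misses: P = R = 0.8
score = f1_score(tp=8, fp=2, fn=2)
```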
Optionally, the specific implementation manner of step 130 is:
calculating the probability values of each class attribution for the client samples to be classified; selecting the samples whose target-class probability value is greater than their non-target-class probability value; arranging them in descending order of probability value; and taking a preset number of the top-ranked client samples as the current target clients.
Taking a two-class problem as an example, the probability that each sample belongs to the positive class and to the negative class is calculated; among the samples whose positive-class probability exceeds their negative-class probability, the customer samples with the highest probabilities are selected as the current target customers for pushing marketing information.
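The two-class selection step can be sketched as follows; the tuple layout `(customer_id, p_positive, p_negative)` and the function name are assumed representations, not from the patent.

```python
def select_targets(scored_customers, top_n):
    """Keep customers whose positive-class probability exceeds their
    negative-class probability, sort descending, take the first top_n."""
    positives = [c for c in scored_customers if c[1] > c[2]]
    positives.sort(key=lambda c: c[1], reverse=True)
    return [c[0] for c in positives[:top_n]]

scored = [("u1", 0.90, 0.10), ("u2", 0.40, 0.60),
          ("u3", 0.70, 0.30), ("u4", 0.55, 0.45)]
targets = select_targets(scored, top_n=2)
```

Filtering before ranking keeps customers who are merely "least unlikely" out of the marketing list even when `top_n` is large.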
Specifically, as shown in fig. 5, the step 300, when implemented, includes:
step 310, comparing the push feedback result with the category label of the current target client, and acquiring a first sample with the push feedback result being unsuccessful in pushing in the current target client;
it should be noted that the significance of collecting the unsuccessfully pushed customer samples is to compare them against the successfully pushed ones; the difference between the two is what drives the adjustment of the naive Bayes classifier.
And step 320, adjusting a screening model according to the first sample.
Further, the specific implementation manner of step 320 is:
according to the formula

P_1(Y = c_k) = \frac{\sum_{i=1}^{N + N_1} I(y_i = c_k)}{N + N_1}

recalculating the prior probability of the naive Bayes classifier;
according to the formula

P_1(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N + N_1} I(x_i^{(j)} = a_{jl}, y_i = c_k)}{\sum_{i=1}^{N + N_1} I(y_i = c_k)}

recalculating the conditional probability of the naive Bayes classifier;
wherein P_1(Y = c_k) is the recalculated prior probability of the naive Bayes classifier; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; N is the total number of samples x in the training data; P_1(X^{(j)} = a_{jl} \mid Y = c_k) is the recalculated conditional probability; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j; k = 1, 2, \ldots, K; N_1 is the number of first samples.
It should be noted that an important problem in data mining is continuously arriving new data, to which the existing classifier must keep adapting. When processing large data sets, merging newly added samples with the known samples and retraining increases the difficulty of learning and, because the sample set grows too large, consumes too much time and storage space. An effective solution is to train on the newly added sample sets separately and to improve learning accuracy gradually as samples accumulate. Marketing feedback data is labeled incremental data, so embodiments of the invention use it to optimize the feature weights of the model. Specifically: the obtained push feedback results (i.e., the incremental data) are first checked with the existing classifier; if they match, the current classifier is kept; otherwise the current classifier is corrected with the new samples, where the selected new samples should help improve the classification precision of the current classifier, so as to adjust the prior and conditional probabilities of the naive Bayes classifier. As new samples are added, their information enters the prior probability, and the posterior probability is determined jointly by the prior probability and the newly added sample information.
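The incremental accumulation of feedback counts described here might look like the sketch below; the class labels and the `IncrementalPrior` name are illustrative, and only the prior is shown (the conditional probabilities would be refreshed from the same kind of running counts).

```python
from collections import Counter

class IncrementalPrior:
    """Keep running class counts so the prior can be refreshed each time
    a batch of labelled marketing-feedback samples arrives."""
    def __init__(self, labels):
        self.counts = Counter(labels)   # counts over the N training samples
        self.total = len(labels)

    def update(self, feedback_labels):
        """Fold in N1 feedback samples without retraining from scratch."""
        self.counts.update(feedback_labels)
        self.total += len(feedback_labels)

    def prior(self):
        return {c: n / self.total for c, n in self.counts.items()}

model = IncrementalPrior(["accept"] * 6 + ["reject"] * 4)  # N = 10
model.update(["reject", "reject"])                          # N1 = 2 unsuccessful pushes
p = model.prior()
```

Because only counts are stored, each update is O(N1) rather than O(N + N1), which is the point of the incremental scheme.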
As shown in fig. 6, the principle of the embodiment of the present invention is: and determining sample knowledge by the prior knowledge, adjusting the sample knowledge by the added new sample, determining the posterior knowledge by the sample knowledge and the prior knowledge, adjusting the prior knowledge by the posterior knowledge, and determining that the prediction result of the test sample is generated by the posterior knowledge.
The following describes the practical application of the embodiment of the present invention with reference to fig. 7 as follows:
take the potential 4G change machine user incremental learning as an example: firstly, extracting a behavior attribute data set related to possible replacement of a User from a data warehouse, such as ARPU (Average income Per User) of the User, an index for measuring business income of telecom operators and Internet companies, call duration, internet behavior characteristics and other data, splitting the data into a training set and a test set according to a proper proportion, constructing a naive Bayes classifier on the training set, and selecting parameters of the classifier by using the test set to obtain an initialized naive Bayes classifier; secondly, extracting samples to be classified from a data warehouse, namely behavior characteristic data of the client, and calculating to obtain possible probability of changing the machine of the client, namely a target client by applying an initialized naive Bayes classifier; through the marketing service device, the target customer is subjected to marketing contact in modes of short messages, manual outbound and the like; a marketing service feedback acquisition device is used for acquiring marketing results and storing marketing result data; on the basis of marketing data acquisition, an incremental learning device is utilized to dynamically adjust the prior probability and the posterior probability corresponding to the initialized classifier, the adjusted model is screened by utilizing a test set, the updated classifier is obtained, and the next round of iterative cycle is entered. In the process, the naive Bayesian classifier continuously updates itself according to the incremental marketing feedback data.
Theoretically, any marketing activity has a target customer group, and selecting that group is conditional; the conditions can be obtained through continuous learning by the incremental learning device. Therefore, the naive-Bayes-based incremental learning system provided by the embodiment of the present invention can improve the accuracy of target-customer identification and the marketing effect, and the more restriction factors there are in customer-group screening, the more the practical value of the invention is demonstrated.
It should be noted that the invention not only positions marketing target customers more accurately, but also, for the first time, dynamically constructs a naive Bayes classifier from marketing feedback information, thereby greatly improving the hit rate and accuracy of marketing, improving the marketing effect, greatly reducing the marketing cost, improving the marketing input-output ratio and maximizing marketing profit.
Example two
As shown in fig. 8, an embodiment of the present invention provides a target customer screening apparatus, including:
the screening module 10 is configured to screen, from the customer samples to be classified, the current target customers to whom preset information is to be pushed;
a feedback result obtaining module 20, configured to obtain the push feedback result after the preset information is pushed to the current target customers;
and the correcting module 30 is configured to correct the screening model for screening the current target customer according to the push feedback result, so as to perform next screening of the target customer according to the corrected screening model.
Specifically, the screening module 10 includes:
the model obtaining submodule is used for obtaining a screening model;
the probability calculation submodule is used for respectively calculating the probability value of each client sample in the client samples to be classified on each category according to the screening model;
and the selection submodule is used for selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample.
When the screening model is a naive bayes classifier, optionally, the model obtaining sub-module comprises:
an acquisition unit, configured to acquire training data and test data for constructing a naive Bayes classifier;
the classifier construction unit is used for constructing an initial classifier by using the training data;
and the selecting unit is used for selecting the classifier by using the test data to obtain a naive Bayesian classifier.
Specifically, the classifier building unit is specifically configured to:
obtaining the prior probability and the conditional probability of an initial classifier by using training data;
and obtaining the class of each sample in the training data and the test data according to the prior probability and the conditional probability, calculating a training error and a test error, and selecting a classifier with the minimum test error as an initial classifier.
Optionally, the specific implementation manner of obtaining the prior probability and the conditional probability of the initial classifier by using the training data is as follows:
using the formula $P(Y=c_k)=\frac{\sum_{i=1}^{N}I(y_i=c_k)}{N}$, the prior probability of the training data on each category is calculated;
using the formula $P(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N}I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N}I(y_i=c_k)}$, the conditional probability of the training data is calculated;
where the training data is $T=\{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$; $N$ is the total number of samples $x$ in the training data; $P(Y=c_k)$ is the prior probability of category $c_k$; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $P(X^{(j)}=a_{jl}\mid Y=c_k)$ is the conditional probability of the training data; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$.
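The two counting formulas above can be illustrated directly; this is a sketch with invented toy data, and `prior` and `conditional` are hypothetical helper names, not the patent's modules.

```python
# Maximum-likelihood estimates by direct counting over the training set.
def prior(labels, c_k):
    # P(Y=c_k) = sum_i I(y_i = c_k) / N
    return sum(1 for y in labels if y == c_k) / len(labels)

def conditional(samples, labels, j, a_jl, c_k):
    # P(X^(j)=a_jl | Y=c_k): feature matches counted within class c_k.
    in_class = [x for x, y in zip(samples, labels) if y == c_k]
    return sum(1 for x in in_class if x[j] == a_jl) / len(in_class)

X = [("4g", "high"), ("4g", "low"), ("2g", "low")]
y = ["buy", "buy", "skip"]
print(prior(y, "buy"))                      # 2/3
print(conditional(X, y, 0, "4g", "buy"))    # 1.0
```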
Optionally, the specific implementation manner of obtaining the class to which each sample in the training data and the test data belongs according to the prior probability and the conditional probability, calculating the training error and the test error, and selecting the classifier with the smallest test error as the initial classifier is as follows:
using the formula $y=\arg\max_{c_k}P(Y=c_k)\prod_{j=1}^{n}P(X^{(j)}=x^{(j)}\mid Y=c_k)$, the category corresponding to the maximum value obtained by each sample in the training data over the preset categories is taken as the category of that sample; wherein
$y$ is the sample class achieving the maximum value; $P(Y=c_k)$ is the prior probability of category $c_k$; $Y$ is the sample category variable; $c_k$ is a sample class value; $P(X^{(j)}=x^{(j)}\mid Y=c_k)$ is the conditional probability; $x^{(j)}$ is the $j$-th feature of $x$, where $x$ is a sample instance and $x=(x^{(1)},x^{(2)},\ldots,x^{(n)})^T$.
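The decision rule above, applied with the same direct counts, can be sketched as follows (toy data; unsmoothed, so a zero count zeroes out a class score, which is what the Bayesian estimation with parameter λ in the selection unit avoids).

```python
# argmax over classes of prior * product of per-feature conditionals.
def nb_predict(samples, labels, x):
    n = len(labels)
    best, best_score = None, -1.0
    for c in set(labels):
        idx = [i for i, y in enumerate(labels) if y == c]
        score = len(idx) / n                   # prior P(Y=c)
        for j, v in enumerate(x):
            # conditional P(X^(j)=v | Y=c) by counting within the class
            score *= sum(1 for i in idx if samples[i][j] == v) / len(idx)
        if score > best_score:
            best, best_score = c, score
    return best

X = [("4g", "high"), ("4g", "low"), ("2g", "low")]
y = ["buy", "buy", "skip"]
print(nb_predict(X, y, ("4g", "high")))  # -> buy
```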
Specifically, the selection unit is configured to:
adjusting the target-variable concentration (the proportion of target-class samples) in the training data and the Bayesian estimation parameter multiple times, calculating an evaluation index value on the test data after each adjustment, and taking the target-variable concentration and the Bayesian estimation parameter corresponding to the largest evaluation index value, to obtain the naive Bayes classifier;
wherein the prior probability of the naive Bayes classifier is: $P_\lambda(Y=c_k)=\frac{\sum_{i=1}^{N}I(y_i=c_k)+\lambda}{N+K\lambda}$;
the conditional probability is: $P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N}I(x_i^{(j)}=a_{jl},\,y_i=c_k)+\lambda}{\sum_{i=1}^{N}I(y_i=c_k)+S_j\lambda}$;
where $P_\lambda(Y=c_k)$ is the prior probability of the naive Bayes classifier; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $\lambda$ is the parameter of the Bayesian estimation; $P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)$ is the conditional probability of the naive Bayes classifier; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$.
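A sketch of the λ-smoothed (Bayesian) estimates above; λ = 1 corresponds to Laplace smoothing. The helper names and data are invented for illustration.

```python
# Bayesian-estimated probabilities with smoothing parameter lam (lambda).
def smoothed_prior(labels, c_k, K, lam=1.0):
    # P_lambda(Y=c_k) = (count(c_k) + lambda) / (N + K*lambda)
    N = len(labels)
    return (sum(1 for y in labels if y == c_k) + lam) / (N + K * lam)

def smoothed_conditional(samples, labels, j, a_jl, c_k, S_j, lam=1.0):
    # P_lambda(X^(j)=a_jl | Y=c_k); S_j = number of distinct values of feature j.
    in_class = [x for x, y in zip(samples, labels) if y == c_k]
    hits = sum(1 for x in in_class if x[j] == a_jl)
    return (hits + lam) / (len(in_class) + S_j * lam)

y = ["buy", "buy", "skip"]
X = [("4g",), ("4g",), ("2g",)]
print(smoothed_prior(y, "skip", K=2))                       # (1+1)/(3+2) = 0.4
print(smoothed_conditional(X, y, 0, "2g", "skip", S_j=2))   # (1+1)/(1+2) ≈ 0.667
```

Unlike the unsmoothed counts, a feature value never seen within a class still receives a nonzero probability, so a single zero count cannot veto a class.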
wherein the evaluation index value is calculated using the formula $F1=\frac{2PR}{P+R}$;
where $F1$ is the evaluation index value; $P$ is the precision of the initial classifier on the test data, i.e. $P$ = (number of samples correctly predicted as positive) / (number of samples predicted as positive); and $R$ is the recall of the initial classifier on the test data, i.e. $R$ = (number of samples correctly predicted as positive) / (number of samples actually positive).
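The F1 computation above in code (a minimal sketch; the counts are invented):

```python
# F1 = 2PR/(P+R), with P = TP/(TP+FP) and R = TP/(TP+FN).
def f1_score(tp, fp, fn):
    p = tp / (tp + fp)   # precision
    r = tp / (tp + fn)   # recall
    return 2 * p * r / (p + r)

print(f1_score(tp=8, fp=2, fn=2))  # P = R = 0.8, so F1 ≈ 0.8
```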
Optionally, the selecting submodule is specifically configured to:
calculating the probability values of each customer sample to be classified belonging to the different categories, selecting the samples whose target-class probability value is greater than their non-target-class probability value, arranging these in descending order of probability value, and selecting a preset number of the top-ranked customer samples as the current target customers.
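The selection rule above can be sketched as follows (customer IDs and probabilities are invented, and `select_targets` is a hypothetical name):

```python
# Keep samples where P(target) > P(non-target), rank descending, take top-N.
def select_targets(scored, preset_number):
    # scored: list of (customer_id, p_target, p_non_target)
    candidates = [(cid, p) for cid, p, q in scored if p > q]
    candidates.sort(key=lambda t: t[1], reverse=True)
    return [cid for cid, _ in candidates[:preset_number]]

scored = [("u1", 0.9, 0.1), ("u2", 0.4, 0.6), ("u3", 0.7, 0.3)]
print(select_targets(scored, 2))  # -> ['u1', 'u3']
```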
Optionally, the modification module 30 includes:
the feedback result screening submodule, configured to compare the push feedback result with the category label of the current target customers and acquire, as first samples, those current target customers whose push feedback result indicates an unsuccessful push;
and the adjusting submodule is used for adjusting the screening model according to the first sample.
Wherein, when the screening model is a naive Bayesian classifier, the adjusting submodule is specifically configured to:
according to the formula $P_1(Y=c_k)=\frac{\sum_{i=1}^{N+N_1}I(y_i=c_k)}{N+N_1}$, recalculating the prior probability of the naive Bayes classifier;
according to the formula $P_1(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N+N_1}I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N+N_1}I(y_i=c_k)}$, recalculating the conditional probability of the naive Bayes classifier;
where $P_1(Y=c_k)$ is the recalculated prior probability of the naive Bayes classifier; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $N$ is the total number of samples $x$ in the training data; $P_1(X^{(j)}=a_{jl}\mid Y=c_k)$ is the recalculated conditional probability of the naive Bayes classifier; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$; and $N_1$ is the number of first samples.
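One way to realize the recalculation above is to pool the $N$ original training labels with the $N_1$ feedback ("first") samples and recount; this is a sketch under that assumption, with invented data and a hypothetical function name.

```python
# Recount the prior over the N original plus N_1 feedback samples.
def updated_prior(labels, feedback_labels, c_k):
    merged = list(labels) + list(feedback_labels)   # N + N_1 samples
    return sum(1 for y in merged if y == c_k) / len(merged)

y = ["buy", "buy", "skip"]            # N = 3 original training labels
fb = ["skip"]                         # N_1 = 1 unsuccessful-push sample
print(updated_prior(y, fb, "skip"))   # 2/4 = 0.5
```

The conditional probabilities would be recounted over the same merged sample set, after which the adjusted model is screened with the test set as the description states.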
It should be noted that the apparatus embodiment is an apparatus corresponding to the method, and all implementations of the method embodiment are applicable to the apparatus embodiment, and the same technical effect can be achieved.
While the preferred embodiments of the present invention have been described, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the following claims.
Claims (14)
1. A method for screening target customers, comprising:
screening current target customers for pushing preset information in a customer sample to be classified;
obtaining a push feedback result after the preset information is pushed to the current target client;
and correcting the screening model for screening the current target client according to the push feedback result so as to screen the next target client according to the corrected screening model.
2. The method for screening target customers according to claim 1, wherein the step of screening the current target customers for pushing preset information in the customer samples to be classified comprises:
obtaining a screening model;
respectively calculating the probability value of each customer sample in the customer samples to be classified in each category according to the screening model;
and selecting the current target client for pushing preset information from the client samples to be classified according to the probability value of each client sample.
3. The method of claim 2, wherein when the screening model is a naive bayes classifier, the step of obtaining the screening model comprises:
acquiring training data and testing data for constructing a naive Bayes classifier;
constructing an initial classifier by using the training data;
and selecting the classifier by using the test data to obtain a naive Bayes classifier.
4. The method of claim 3, wherein the step of constructing an initial classifier using the training data comprises:
obtaining the prior probability and the conditional probability of an initial classifier by using training data;
and obtaining the class of each sample in the training data and the test data according to the prior probability and the conditional probability, calculating a training error and a test error, and selecting a classifier with the minimum test error as an initial classifier.
5. The method of claim 4, wherein the step of using the training data to obtain the prior probability and the conditional probability of the initial classifier comprises:
using the formula $P(Y=c_k)=\frac{\sum_{i=1}^{N}I(y_i=c_k)}{N}$, the prior probability of the training data on each category is calculated;
using the formula $P(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N}I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N}I(y_i=c_k)}$, the conditional probability of the training data is calculated;
where the training data is $T=\{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$; $N$ is the total number of samples $x$ in the training data; $P(Y=c_k)$ is the prior probability of category $c_k$; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $P(X^{(j)}=a_{jl}\mid Y=c_k)$ is the conditional probability of the training data; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$.
6. The method for screening target customers according to claim 4, wherein the step of obtaining the class to which each sample in the training data and the test data belongs according to the prior probability and the conditional probability, calculating the training error and the test error, and selecting the classifier with the minimum test error as the initial classifier comprises:
using the formula $y=\arg\max_{c_k}P(Y=c_k)\prod_{j=1}^{n}P(X^{(j)}=x^{(j)}\mid Y=c_k)$, the category corresponding to the maximum value obtained by each sample in the training data over the preset categories is taken as the category of that sample; wherein
$y$ is the sample class achieving the maximum value; $P(Y=c_k)$ is the prior probability of category $c_k$; $Y$ is the sample category variable; $c_k$ is a sample class value; $P(X^{(j)}=x^{(j)}\mid Y=c_k)$ is the conditional probability; $x^{(j)}$ is the $j$-th feature of $x$, where $x$ is a sample instance and $x=(x^{(1)},x^{(2)},\ldots,x^{(n)})^T$.
7. The method of claim 3, wherein the step of selecting the classifier using the test data to obtain a naive Bayesian classifier comprises:
adjusting the target-variable concentration (the proportion of target-class samples) in the training data and the Bayesian estimation parameter multiple times, calculating an evaluation index value on the test data after each adjustment, and taking the target-variable concentration and the Bayesian estimation parameter corresponding to the largest evaluation index value, to obtain the naive Bayes classifier;
wherein the prior probability of the naive Bayes classifier is $P_\lambda(Y=c_k)=\frac{\sum_{i=1}^{N}I(y_i=c_k)+\lambda}{N+K\lambda}$, and the conditional probability is $P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N}I(x_i^{(j)}=a_{jl},\,y_i=c_k)+\lambda}{\sum_{i=1}^{N}I(y_i=c_k)+S_j\lambda}$;
where $P_\lambda(Y=c_k)$ is the prior probability of the naive Bayes classifier; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $\lambda$ is the parameter of the Bayesian estimation; $P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)$ is the conditional probability of the naive Bayes classifier; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$.
8. The method of claim 7, wherein the evaluation index value is calculated using the formula $F1=\frac{2PR}{P+R}$;
where $F1$ is the evaluation index value; $P$ is the precision of the initial classifier on the test data, i.e. $P$ = (number of samples correctly predicted as positive) / (number of samples predicted as positive); and $R$ is the recall of the initial classifier on the test data, i.e. $R$ = (number of samples correctly predicted as positive) / (number of samples actually positive).
9. The method for screening target customers according to claim 2, wherein the step of selecting the target customer of the current time for pushing preset information from the customer samples to be classified according to the probability value of each customer sample comprises:
calculating the probability values of each customer sample to be classified belonging to the different categories, selecting the samples whose target-class probability value is greater than their non-target-class probability value, arranging these in descending order of probability value, and selecting a preset number of the top-ranked customer samples as the current target customers.
10. The method for screening target customers according to claim 1, wherein the step of modifying the screening model for screening the current target customer according to the push feedback result comprises:
comparing the push feedback result with the category label of the current target customers, and acquiring, as first samples, those current target customers whose push feedback result indicates an unsuccessful push;
and adjusting a screening model according to the first sample.
11. The method of claim 10, wherein when the screening model is a naive bayes classifier, the step of adjusting the screening model based on the first sample comprises:
according to the formula $P_1(Y=c_k)=\frac{\sum_{i=1}^{N+N_1}I(y_i=c_k)}{N+N_1}$, recalculating the prior probability of the naive Bayes classifier;
according to the formula $P_1(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N+N_1}I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N+N_1}I(y_i=c_k)}$, recalculating the conditional probability of the naive Bayes classifier;
where $P_1(Y=c_k)$ is the recalculated prior probability of the naive Bayes classifier; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $N$ is the total number of samples $x$ in the training data; $P_1(X^{(j)}=a_{jl}\mid Y=c_k)$ is the recalculated conditional probability of the naive Bayes classifier; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$; and $N_1$ is the number of first samples.
12. A target customer screening apparatus, comprising:
the screening module is configured to screen, from the customer samples to be classified, the current target customers to whom preset information is to be pushed;
the feedback result acquisition module is used for acquiring a push feedback result after the preset information is pushed to the current target client;
and the correction module is used for correcting the screening model for screening the current target client according to the push feedback result so as to screen the next target client according to the corrected screening model.
13. The targeted customer screening apparatus of claim 12, wherein the screening module comprises:
the model obtaining submodule is used for obtaining a screening model;
the probability calculation submodule is used for respectively calculating the probability value of each client sample in the client samples to be classified on each category according to the screening model;
and the selection submodule is used for selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample.
14. The targeted customer screening apparatus of claim 12, wherein the revision module comprises:
the feedback result screening submodule, configured to compare the push feedback result with the category label of the current target customers and acquire, as first samples, those current target customers whose push feedback result indicates an unsuccessful push;
and the adjusting submodule is used for adjusting the screening model according to the first sample.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610652832.0A CN107730286A (en) | 2016-08-10 | 2016-08-10 | A kind of target customer's screening technique and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107730286A true CN107730286A (en) | 2018-02-23 |
Family
ID=61199431
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109191210A (en) * | 2018-09-13 | 2019-01-11 | 厦门大学嘉庚学院 | A kind of broadband target user's recognition methods based on Adaboost algorithm |
CN110046770A (en) * | 2019-04-23 | 2019-07-23 | 中国科学技术大学 | Grain mildew prediction technique and device |
CN110062023A (en) * | 2019-03-12 | 2019-07-26 | 阿里巴巴集团控股有限公司 | A kind of safety education information-pushing method, device and equipment |
CN110119466A (en) * | 2019-03-29 | 2019-08-13 | 五渡(杭州)科技有限责任公司 | A kind of big data intelligent marketing system and method |
CN110457566A (en) * | 2019-08-15 | 2019-11-15 | 腾讯科技(武汉)有限公司 | Method, device, electronic equipment and storage medium |
CN110619534A (en) * | 2018-06-19 | 2019-12-27 | 华为技术有限公司 | Marketing information pushing method and device and readable storage medium |
CN111612492A (en) * | 2019-02-26 | 2020-09-01 | 北京奇虎科技有限公司 | User online accurate marketing method and device based on multi-feature fusion |
CN111768040A (en) * | 2020-07-01 | 2020-10-13 | 深圳前海微众银行股份有限公司 | Model interpretation method, device, equipment and readable storage medium |
CN111797942A (en) * | 2020-07-23 | 2020-10-20 | 深圳壹账通智能科技有限公司 | User information classification method and device, computer equipment and storage medium |
CN112581191A (en) * | 2020-08-14 | 2021-03-30 | 支付宝(杭州)信息技术有限公司 | Training method and device of behavior prediction model |
CN113065885A (en) * | 2021-03-01 | 2021-07-02 | 苏宁金融科技(南京)有限公司 | Method and system for intelligent marketing |
CN114418376A (en) * | 2022-01-14 | 2022-04-29 | 上海明胜品智人工智能科技有限公司 | Enterprise task issuing method and device, electronic equipment and storage medium |
CN114757724A (en) * | 2022-06-14 | 2022-07-15 | 湖南三湘银行股份有限公司 | Precise information pushing system and method based on genetic algorithm |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102982042A (en) * | 2011-09-07 | 2013-03-20 | 中国移动通信集团公司 | Personalization content recommendation method and platform and system |
CN103150696A (en) * | 2011-12-06 | 2013-06-12 | 中兴通讯股份有限公司 | Method and device for selecting potential customer of target value-added service |
CN104391860A (en) * | 2014-10-22 | 2015-03-04 | 安一恒通(北京)科技有限公司 | Content type detection method and device |
CN105608592A (en) * | 2014-08-21 | 2016-05-25 | 深圳市深讯信息科技发展股份有限公司 | Telecom user intelligent analyzing and pushing method and telecom user intelligent analyzing and pushing system |
Non-Patent Citations (2)
Title |
---|
唐炉亮 et al.: "Lane-Number Detection Based on Naive Bayes Classification", China Journal of Highway and Transport (《中国公路学报》) *
高影繁 et al.: "A Bayesian Text Classification Algorithm Combined with Parameter Optimization", Journal of Computer Research and Development (《计算机研究与发展》) *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107730286A (en) | A kind of target customer's screening technique and device | |
US8498950B2 (en) | System for training classifiers in multiple categories through active learning | |
US11436434B2 (en) | Machine learning techniques to identify predictive features and predictive values for each feature | |
CN103617435B (en) | Image sorting method and system for active learning | |
CN106327240A (en) | Recommendation method and recommendation system based on GRU neural network | |
CN111797320B (en) | Data processing method, device, equipment and storage medium | |
CN112633962B (en) | Service recommendation method and device, computer equipment and storage medium | |
CN110599295B (en) | Method, device and equipment for pushing articles | |
CN112149352B (en) | Prediction method for marketing activity clicking by combining GBDT automatic characteristic engineering | |
CN105550295A (en) | Classification model optimization method and classification model optimization apparatus | |
CN112669078A (en) | Behavior prediction model training method, device, equipment and storage medium | |
Zhou et al. | Longitudinal impact of preference biases on recommender systems’ performance | |
CN113407854A (en) | Application recommendation method, device and equipment and computer readable storage medium | |
CN113570398A (en) | Promotion data processing method, model training method, system and storage medium | |
CN114782123A (en) | Credit assessment method and system | |
CN113378067B (en) | Message recommendation method, device and medium based on user mining | |
CN110457387A (en) | A kind of method and relevant apparatus determining applied to user tag in network | |
CN110766086B (en) | Method and device for fusing multiple classification models based on reinforcement learning model | |
CN116912016A (en) | Bill auditing method and device | |
CN112214675B (en) | Method, device, equipment and computer storage medium for determining user purchasing machine | |
CN108038735A (en) | Data creation method and device | |
CN117217711B (en) | Automatic auditing method and system for communication fee receipt | |
CN112101776A (en) | Crowdsourcing task work group determination method | |
CN115619292B (en) | Method and device for problem management | |
CN113034167A (en) | User interest analysis method and advertisement delivery method based on user behaviors |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180223 |