CN107730286A - A kind of target customer's screening technique and device - Google Patents
- Publication number: CN107730286A
- Application number: CN201610652832.0A
- Authority
- CN
- China
- Prior art keywords
- sample
- screening
- probability
- classifier
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0202—Market predictions or forecasting for commercial activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0207—Discounts or incentives, e.g. coupons or rebates
- G06Q30/0224—Discounts or incentives, e.g. coupons or rebates based on user history
Abstract
The invention provides a target customer screening method and device. The method includes: screening, from the customer samples to be classified, the current target customers to whom preset information is to be pushed; obtaining the push feedback results after the preset information has been pushed to these target customers; and correcting the screening model used to select the current target customers according to the push feedback results, so that the next round of target customers is screened with the corrected model. By dynamically adjusting the data mining model with marketing feedback results, the scheme improves the accuracy of marketing customer screening, substantially raises the marketing hit rate and precision, improves the marketing effect, considerably reduces the marketing cost, and increases the marketing input-output ratio and marketing revenue.
Description
Technical Field
The invention relates to the technical field of data warehouses, and in particular to a target customer screening method and device.
Background
As market competition intensifies, how to improve the selection accuracy of the target customer group, raise the marketing success rate, and reduce the marketing cost is a constant concern for telecommunications operators and similar enterprises. At present, a data mining model is generally established for marketing activities in a specific field, and customer groups are screened by matching features against user preferences. The data mining model is typically built on analysis of historical data: screening rules for the corresponding user groups are found, and marketing customer groups are then obtained by applying those rules. The latest marketing feedback results are not used in the construction process.
The prior art mainly has the following two defects:
1. hysteresis in data mining model adjustment.
The rules found by data mining often depend on the output indexes, the completeness of the dimensions, and the experience of the data mining team. Once model training has been verified, the model result is fixed, and the target user group is screened out by the corresponding rules. The model's input indexes and their weights are fixed and cannot adjust automatically as the business changes. Only when model performance has degraded to a certain degree do marketing personnel take notice, after which a large amount of manpower, material resources, and time is invested in a second round of optimization of the data mining model.
2. Marketing feedback cannot be fully utilized.
Traditional marketing feedback information is used only to evaluate the effect of marketing activities or the quality of models; the actual feedback information is not fully utilized, and a true closed marketing loop is not realized.
As the market environment changes, user behavior changes with it. The original static model rules or customer labels become increasingly unable to adapt to changes in the market and in customer behavior, so customer group screening becomes inaccurate and the marketing effect grows steadily worse.
Disclosure of Invention
The invention aims to provide a method and a device for screening target customers, which are used for solving the problems of inaccurate screening of marketing customers and poor marketing effect caused by the fact that the existing static data mining model cannot be dynamically adjusted in combination with marketing feedback results.
In order to solve the above technical problem, an embodiment of the present invention provides a target customer screening method, including:
screening current target customers for pushing preset information in a customer sample to be classified;
obtaining a push feedback result after the preset information is pushed to the current target client;
and correcting the screening model for screening the current target client according to the push feedback result so as to screen the next target client according to the corrected screening model.
Further, the step of screening out the target client for pushing the preset information in the client sample to be classified includes:
obtaining a screening model;
respectively calculating the probability value of each customer sample in the customer samples to be classified in each category according to the screening model;
and selecting the current target client for pushing preset information from the client samples to be classified according to the probability value of each client sample.
Further, when the screening model is a naive bayes classifier, the step of obtaining the screening model comprises:
acquiring training data and testing data for constructing a naive Bayes classifier;
constructing an initial classifier by using the training data;
and selecting the classifier by using the test data to obtain a naive Bayes classifier.
Further, the step of constructing an initial classifier using the training data includes:
obtaining the prior probability and the conditional probability of an initial classifier by using training data;
and obtaining the class of each sample in the training data and the test data according to the prior probability and the conditional probability, calculating a training error and a test error, and selecting a classifier with the minimum test error as an initial classifier.
Further, the step of obtaining the prior probability and the conditional probability of the initial classifier by using the training data includes:
using the formula

P(Y = c_k) = \frac{1}{N} \sum_{i=1}^{N} I(y_i = c_k)

to calculate the prior probability of the training data on each category;
using the formula

P(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl}, y_i = c_k)}{\sum_{i=1}^{N} I(y_i = c_k)}

to calculate the conditional probability of the training data;
wherein the training data is T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}; N is the total number of samples x in the training data; P(Y = c_k) is the prior probability on category Y; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; P(X^{(j)} = a_{jl} \mid Y = c_k) is the conditional probability of the training data; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j; k = 1, 2, \ldots, K.
Further, the step of obtaining the class to which each sample in the training data and the test data belongs according to the prior probability and the conditional probability, calculating the training error and the test error, and selecting the classifier with the minimum test error as the initial classifier includes:
using the formula

y = \arg\max_{c_k} P(Y = c_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)

taking the category corresponding to the maximum value obtained by each sample in the training data over the preset categories as the category of that sample; wherein
y is the sample class with the maximum value; P(Y = c_k) is the prior probability on category Y; Y denotes the sample category; c_k is a sample class value; P(X^{(j)} = x^{(j)} \mid Y = c_k) is the conditional probability; x^{(j)} is the j-th feature of the sample instance x, where x = (x^{(1)}, x^{(2)}, \ldots, x^{(n)})^T.
Further, the step of selecting the classifier by using the test data to obtain a naive bayes classifier comprises:
adjusting, multiple times, the target-event concentration (the proportion of positive samples) in the training data and the parameter of the Bayesian estimation; calculating an evaluation index value on the test data after each adjustment; and taking the target-event concentration and Bayesian estimation parameter that correspond to the largest evaluation index value, thereby obtaining the naive Bayes classifier;
wherein the prior probability of the naive Bayes classifier is

P_\lambda(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k) + \lambda}{N + K\lambda}

and the conditional probability is

P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl}, y_i = c_k) + \lambda}{\sum_{i=1}^{N} I(y_i = c_k) + S_j \lambda}

wherein P_\lambda(Y = c_k) is the prior probability of the naive Bayes classifier; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; \lambda is the parameter of the Bayesian estimation; P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) is the conditional probability of the naive Bayes classifier; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j, where S_j is the number of values the j-th feature may take; k = 1, 2, \ldots, K.
Further, the evaluation index value is calculated using the formula

F1 = \frac{2PR}{P + R}

wherein F1 is the evaluation index value; P is the precision of the initial classifier on the test data, P = (number of samples correctly predicted as positive) / (number of samples predicted as positive); R is the recall of the initial classifier on the test data, R = (number of samples correctly predicted as positive) / (number of samples that are actually positive).
Further, the step of selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample comprises:
calculating the probability values of each class attribution for the client samples to be classified; selecting the samples whose target-class probability value is greater than their non-target-class probability value; arranging them in descending order of probability value; and taking a preset number of the top-ranked client samples as the current target clients.
Further, the step of modifying the screening model for screening the current target client according to the push feedback result includes:
comparing the push feedback result with the category label of the current target client, and acquiring a first sample with an unsuccessful push feedback result from the current target client;
and adjusting a screening model according to the first sample.
Further, when the screening model is a naive bayes classifier, the step of adjusting the screening model according to the first sample comprises:
according to the formula

P_1(Y = c_k) = \frac{\sum_{i=1}^{N + N_1} I(y_i = c_k)}{N + N_1}

recalculating the prior probability of the naive Bayes classifier;
according to the formula

P_1(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N + N_1} I(x_i^{(j)} = a_{jl}, y_i = c_k)}{\sum_{i=1}^{N + N_1} I(y_i = c_k)}

recalculating the conditional probability of the naive Bayes classifier;
wherein P_1(Y = c_k) is the recalculated prior probability of the naive Bayes classifier; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; N is the total number of samples x in the training data; P_1(X^{(j)} = a_{jl} \mid Y = c_k) is the recalculated conditional probability; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j; k = 1, 2, \ldots, K; N_1 is the number of first samples.
An embodiment of the present invention provides a target customer screening apparatus, including:
the screening module is used for screening, from the client samples to be classified, the current target clients to whom preset information is to be pushed;
the feedback result acquisition module is used for acquiring a push feedback result after the preset information is pushed to the current target client;
and the correction module is used for correcting the screening model for screening the current target client according to the push feedback result so as to screen the next target client according to the corrected screening model.
Further, the screening module includes:
the model obtaining submodule is used for obtaining a screening model;
the probability calculation submodule is used for respectively calculating the probability value of each client sample in the client samples to be classified on each category according to the screening model;
and the selection submodule is used for selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample.
Further, the correction module comprises:
the feedback result screening submodule is used for comparing the push feedback results with the category labels of the current target clients, and acquiring, from the current target clients, the first samples whose push feedback results indicate unsuccessful pushes;
and the adjusting submodule is used for adjusting the screening model according to the first sample.
The invention has the beneficial effects that:
according to the scheme, the data mining model is dynamically adjusted by utilizing the marketing feedback result, the accuracy of marketing customer screening is improved, the hit rate and accuracy of marketing are greatly improved, the marketing effect is improved, the marketing cost is greatly reduced, and the marketing input-output ratio and the marketing income are improved.
Drawings
Fig. 1 is a schematic flow chart illustrating a target customer screening method according to a first embodiment of the present invention.
FIG. 2 is a flowchart illustrating an implementation of step 100;
FIG. 3 is a flowchart illustrating an implementation of step 110;
FIG. 4 is a flowchart illustrating an implementation of step 112;
FIG. 5 is a flowchart illustrating an implementation of step 300;
FIG. 6 is a schematic structural diagram of an incremental learning model according to a first embodiment of the present invention;
FIG. 7 is a diagram of a naive Bayes incremental learning-based system architecture according to a first embodiment of the present invention;
fig. 8 is a schematic structural diagram of a target customer screening apparatus according to a second embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
The invention provides a target customer screening method and device aiming at the problems that the existing static data mining model cannot be dynamically adjusted by combining with a marketing feedback result, so that the screening of marketing customers is inaccurate and the marketing effect is poor.
Example one
As shown in fig. 1, the target customer screening method according to the embodiment of the present invention includes:
step 100, screening current target customers for pushing preset information in a sample of the customers to be classified;
it should be noted that the preset information generally refers to marketing information, for example, information pushing a 4G handset upgrade to the user, information recommending subscription to a certain functional service, and the like.
200, acquiring a push feedback result after preset information is pushed to the current target client;
after receiving the preset information, the user makes a selection based on its content, either accepting or rejecting it. The push feedback result is the marketing feedback, which generally includes successful marketing (the user accepts the pushed service) and unsuccessful marketing (the user rejects the pushed service).
And 300, correcting the screening model for screening the current target client according to the push feedback result, and screening the next target client according to the corrected screening model.
According to the embodiment of the invention, the data mining model is dynamically adjusted by using the marketing feedback result, so that the screening accuracy of marketing customers is improved, the hit rate and accuracy of marketing are greatly improved, the marketing effect is improved, the marketing cost is greatly reduced, and the marketing input-output ratio and the marketing profit are improved.
Optionally, as shown in fig. 2, when implemented specifically, step 100 includes:
step 110, obtaining a screening model;
it should be noted that, in this embodiment, a naive bayes classifier is adopted as a screening model of the client sample.
Step 120, respectively calculating the probability value of each client sample in the client samples to be classified on each category according to the screening model;
and step 130, selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample.
Step 100 is mainly implemented by screening out target customers (i.e., marketing customers) with a naive Bayes classifier; usually, the customer samples given the largest probability values by the classifier are taken as marketing customers.
Further, as shown in fig. 3, the step 110, when implemented specifically, includes:
step 111, acquiring training data and test data for constructing a naive Bayes classifier;
step 112, constructing an initial classifier by using the training data;
and 113, selecting the classifier by using the test data to obtain a naive Bayes classifier.
It should be noted that, when constructing the naive Bayes classifier, an initial classifier is generally constructed using part of the client samples as training data; the initial classifier is then tested and optimized with the test data to obtain the final naive Bayes classifier.
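The train/test partition described here can be sketched as follows; the 70/30 split ratio and the `split_samples` helper are illustrative assumptions, not part of the patent.

```python
import random

def split_samples(samples, test_fraction=0.3, seed=7):
    """Shuffle the client samples and split them into a training set
    and a test set according to test_fraction."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]

# ten dummy client samples, split 70/30
train, test = split_samples(range(10), test_fraction=0.3)
```

A fixed seed keeps the split reproducible between the construction and selection steps.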
Specifically, as shown in fig. 4, step 112, when implemented, comprises:
step 1121, obtaining the prior probability and the conditional probability of the initial classifier by using the training data;
step 1122, obtaining the class of each sample in the training data and the test data according to the prior probability and the conditional probability, calculating the training error and the test error, and selecting the classifier with the minimum test error as the initial classifier.
In general, the specific implementation manner of step 1121 is as follows:
using the formula

P(Y = c_k) = \frac{1}{N} \sum_{i=1}^{N} I(y_i = c_k)

to calculate the prior probability of the training data on each category;
using the formula

P(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl}, y_i = c_k)}{\sum_{i=1}^{N} I(y_i = c_k)}

to calculate the conditional probability of the training data;
wherein the training data is T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}; N is the total number of samples x in the training data; P(Y = c_k) is the prior probability on category Y; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; P(X^{(j)} = a_{jl} \mid Y = c_k) is the conditional probability of the training data; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j; k = 1, 2, \ldots, K, where K is the number of sample classes.
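The count-based estimation of the prior and conditional probabilities above can be sketched in Python; the toy feature values and the `estimate_probabilities` name are hypothetical.

```python
from collections import Counter, defaultdict

def estimate_probabilities(samples, labels):
    """Estimate the prior P(Y=c_k) and the conditional P(X^(j)=a_jl | Y=c_k)
    by counting, mirroring the two formulas in step 1121."""
    n = len(labels)
    class_counts = Counter(labels)                    # sum of I(y_i = c_k)
    prior = {c: class_counts[c] / n for c in class_counts}
    cond = defaultdict(dict)                          # (class, feature j) -> {value: prob}
    for j in range(len(samples[0])):
        pair_counts = Counter((labels[i], samples[i][j]) for i in range(n))
        for (c, v), cnt in pair_counts.items():
            cond[(c, j)][v] = cnt / class_counts[c]
    return prior, cond

# toy customer samples: two categorical features, binary class
X = [("young", "low"), ("young", "high"), ("old", "high"), ("old", "low")]
y = ["no", "yes", "yes", "no"]
prior, cond = estimate_probabilities(X, y)
```

Both counts come from a single pass over the training data, which is what makes the later incremental update cheap.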
In general, the specific implementation of step 1122 is:
using the formula

y = \arg\max_{c_k} P(Y = c_k) \prod_{j=1}^{n} P(X^{(j)} = x^{(j)} \mid Y = c_k)

taking the category corresponding to the maximum value obtained by each sample in the training data over the preset categories as the category of that sample; wherein
y is the sample class with the maximum value; P(Y = c_k) is the prior probability on category Y; Y denotes the sample category; c_k is a sample class value; P(X^{(j)} = x^{(j)} \mid Y = c_k) is the conditional probability; x^{(j)} is the j-th feature of the sample instance x, where x = (x^{(1)}, x^{(2)}, \ldots, x^{(n)})^T.
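The arg-max classification rule above can be sketched as follows, using assumed toy probabilities; working in log space and flooring unseen feature values at a small `eps` are implementation conveniences, not part of the patent.

```python
import math

def classify(x, prior, cond, eps=1e-9):
    """Return the class c_k maximising P(Y=c_k) * prod_j P(X^(j)=x^(j)|Y=c_k).
    Log space avoids floating-point underflow for many features."""
    best_class, best_score = None, -math.inf
    for c, p in prior.items():
        score = math.log(p)
        for j, v in enumerate(x):
            score += math.log(cond.get((c, j), {}).get(v, eps))
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# hypothetical probabilities in the same (class, feature-index) layout
prior = {"yes": 0.5, "no": 0.5}
cond = {("yes", 0): {"young": 0.5, "old": 0.5}, ("yes", 1): {"high": 1.0},
        ("no", 0): {"young": 0.5, "old": 0.5}, ("no", 1): {"low": 1.0}}
label = classify(("old", "high"), prior, cond)
```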
Specifically, step 113 is implemented as follows:
adjusting, multiple times, the target-event concentration (the proportion of positive samples) in the training data and the parameter of the Bayesian estimation; calculating an evaluation index value on the test data after each adjustment; and taking the target-event concentration and Bayesian estimation parameter that correspond to the largest evaluation index value, thereby obtaining the naive Bayes classifier;
wherein the prior probability of the naive Bayes classifier is

P_\lambda(Y = c_k) = \frac{\sum_{i=1}^{N} I(y_i = c_k) + \lambda}{N + K\lambda}

and the conditional probability is

P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl}, y_i = c_k) + \lambda}{\sum_{i=1}^{N} I(y_i = c_k) + S_j \lambda}

wherein P_\lambda(Y = c_k) is the prior probability of the naive Bayes classifier; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; \lambda \ge 0 is the parameter of the Bayesian estimation; P_\lambda(X^{(j)} = a_{jl} \mid Y = c_k) is the conditional probability of the naive Bayes classifier; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j, where S_j is the number of values the j-th feature may take; k = 1, 2, \ldots, K.
It should be noted that when a probability value estimated by maximum likelihood is 0, it affects the computation of the posterior probability and may bias the classification. To solve this problem, the conditional and prior probabilities are computed using Bayesian estimation with the parameter \lambda.
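The smoothed prior above can be sketched in a few lines, showing that with \lambda > 0 no class probability is ever zero even for a class unseen in the data; the example labels are hypothetical.

```python
from collections import Counter

def smoothed_prior(labels, classes, lam=1.0):
    """P_lambda(Y=c_k) = (count(c_k) + lambda) / (N + K*lambda).
    With lam > 0 every class gets strictly positive probability."""
    n, k = len(labels), len(classes)
    counts = Counter(labels)
    return {c: (counts[c] + lam) / (n + k * lam) for c in classes}

# "maybe" never occurs, yet receives non-zero probability after smoothing
p = smoothed_prior(["yes", "yes", "no"], ["yes", "no", "maybe"], lam=1.0)
```

The same `+ lambda` numerator and `+ S_j * lambda` denominator pattern applies to the conditional probabilities.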
The above evaluation index value is calculated using the formula

F1 = \frac{2PR}{P + R}

wherein F1 is the evaluation index value; P is the precision of the initial classifier on the test data, P = (number of samples correctly predicted as positive) / (number of samples predicted as positive); R is the recall of the initial classifier on the test data, R = (number of samples correctly predicted as positive) / (number of samples that are actually positive).
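The F1 computation can be sketched directly from the confusion-matrix counts; the names `tp`, `fp`, `fn` (true positives, false positives, false negatives) are the usual terminology, assumed here rather than taken from the patent.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*P*R/(P+R), with P = tp/(tp+fp) and R = tp/(tp+fn)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 8 correct positives, 2 false alarms, 2 misses: P = R = 0.8
score = f1_score(tp=8, fp=2, fn=2)
```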
Optionally, the specific implementation manner of step 130 is:
calculating the probability values of each class attribution for the client samples to be classified; selecting the samples whose target-class probability value is greater than their non-target-class probability value; arranging them in descending order of probability value; and taking a preset number of the top-ranked client samples as the current target clients.
Taking a two-class problem as an example, the probability that each sample belongs to the positive class and to the negative class is calculated; among the samples whose positive-class probability exceeds their negative-class probability, the customer samples with the highest probabilities are selected as the current target customers for pushing marketing information.
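The two-class selection step can be sketched as follows; the tuple layout `(customer_id, p_positive, p_negative)` and the function name are assumed representations, not from the patent.

```python
def select_targets(scored_customers, top_n):
    """Keep customers whose positive-class probability exceeds their
    negative-class probability, sort descending, take the first top_n."""
    positives = [c for c in scored_customers if c[1] > c[2]]
    positives.sort(key=lambda c: c[1], reverse=True)
    return [c[0] for c in positives[:top_n]]

scored = [("u1", 0.90, 0.10), ("u2", 0.40, 0.60),
          ("u3", 0.70, 0.30), ("u4", 0.55, 0.45)]
targets = select_targets(scored, top_n=2)
```

Filtering before ranking keeps customers who are merely "least unlikely" out of the marketing list even when `top_n` is large.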
Specifically, as shown in fig. 5, the step 300, when implemented, includes:
step 310, comparing the push feedback result with the category label of the current target client, and acquiring a first sample with the push feedback result being unsuccessful in pushing in the current target client;
it should be noted that the significance of collecting the unsuccessfully pushed customer samples is to compare them against the successfully pushed ones; the difference between the two is what drives the adjustment of the naive Bayes classifier.
And step 320, adjusting a screening model according to the first sample.
Further, the specific implementation manner of step 320 is:
according to the formula

P_1(Y = c_k) = \frac{\sum_{i=1}^{N + N_1} I(y_i = c_k)}{N + N_1}

recalculating the prior probability of the naive Bayes classifier;
according to the formula

P_1(X^{(j)} = a_{jl} \mid Y = c_k) = \frac{\sum_{i=1}^{N + N_1} I(x_i^{(j)} = a_{jl}, y_i = c_k)}{\sum_{i=1}^{N + N_1} I(y_i = c_k)}

recalculating the conditional probability of the naive Bayes classifier;
wherein P_1(Y = c_k) is the recalculated prior probability of the naive Bayes classifier; y_i is the target variable of the i-th sample, representing its class, with y_i \in \{c_1, c_2, \ldots, c_K\}, and c_k is a sample class value; I(\cdot) is the indicator function counting samples of a class; N is the total number of samples x in the training data; P_1(X^{(j)} = a_{jl} \mid Y = c_k) is the recalculated conditional probability; x_i^{(j)} is the j-th feature of the i-th sample, and a_{jl} is the l-th value that the j-th feature may take; j = 1, 2, \ldots, n; l = 1, 2, \ldots, S_j; k = 1, 2, \ldots, K; N_1 is the number of first samples.
It should be noted that an important problem in data mining is continuously arriving new data, to which the existing classifier must keep adapting. When processing large data sets, merging newly added samples with the known samples and retraining increases the difficulty of learning and, because the sample set grows too large, consumes too much time and storage space. An effective solution is to train on the newly added sample sets separately and to improve learning accuracy gradually as samples accumulate. Marketing feedback data is labeled incremental data, so embodiments of the invention use it to optimize the feature weights of the model. Specifically: the obtained push feedback results (i.e., the incremental data) are first checked with the existing classifier; if they match, the current classifier is kept; otherwise the current classifier is corrected with the new samples, where the selected new samples should help improve the classification precision of the current classifier, so as to adjust the prior and conditional probabilities of the naive Bayes classifier. As new samples are added, their information enters the prior probability, and the posterior probability is determined jointly by the prior probability and the newly added sample information.
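The incremental accumulation of feedback counts described here might look like the sketch below; the class labels and the `IncrementalPrior` name are illustrative, and only the prior is shown (the conditional probabilities would be refreshed from the same kind of running counts).

```python
from collections import Counter

class IncrementalPrior:
    """Keep running class counts so the prior can be refreshed each time
    a batch of labelled marketing-feedback samples arrives."""
    def __init__(self, labels):
        self.counts = Counter(labels)   # counts over the N training samples
        self.total = len(labels)

    def update(self, feedback_labels):
        """Fold in N1 feedback samples without retraining from scratch."""
        self.counts.update(feedback_labels)
        self.total += len(feedback_labels)

    def prior(self):
        return {c: n / self.total for c, n in self.counts.items()}

model = IncrementalPrior(["accept"] * 6 + ["reject"] * 4)  # N = 10
model.update(["reject", "reject"])                          # N1 = 2 unsuccessful pushes
p = model.prior()
```

Because only counts are stored, each update is O(N1) rather than O(N + N1), which is the point of the incremental scheme.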
As shown in fig. 6, the principle of the embodiment of the present invention is: and determining sample knowledge by the prior knowledge, adjusting the sample knowledge by the added new sample, determining the posterior knowledge by the sample knowledge and the prior knowledge, adjusting the prior knowledge by the posterior knowledge, and determining that the prediction result of the test sample is generated by the posterior knowledge.
The following describes the practical application of the embodiment of the present invention with reference to fig. 7 as follows:
take the potential 4G change machine user incremental learning as an example: firstly, extracting a behavior attribute data set related to possible replacement of a User from a data warehouse, such as ARPU (Average income Per User) of the User, an index for measuring business income of telecom operators and Internet companies, call duration, internet behavior characteristics and other data, splitting the data into a training set and a test set according to a proper proportion, constructing a naive Bayes classifier on the training set, and selecting parameters of the classifier by using the test set to obtain an initialized naive Bayes classifier; secondly, extracting samples to be classified from a data warehouse, namely behavior characteristic data of the client, and calculating to obtain possible probability of changing the machine of the client, namely a target client by applying an initialized naive Bayes classifier; through the marketing service device, the target customer is subjected to marketing contact in modes of short messages, manual outbound and the like; a marketing service feedback acquisition device is used for acquiring marketing results and storing marketing result data; on the basis of marketing data acquisition, an incremental learning device is utilized to dynamically adjust the prior probability and the posterior probability corresponding to the initialized classifier, the adjusted model is screened by utilizing a test set, the updated classifier is obtained, and the next round of iterative cycle is entered. In the process, the naive Bayesian classifier continuously updates itself according to the incremental marketing feedback data.
Theoretically, any marketing activity has a target customer group, and selecting that group is conditional; the conditions can be obtained through continuous learning by the incremental learning device. Therefore, the naive-Bayes-based incremental learning system provided by the embodiment of the present invention can improve the accuracy of target-customer identification and the marketing effect, and the more restriction factors there are in customer-group screening, the more the practical value of the invention is demonstrated.
It should be noted that the invention not only positions marketing target customers more accurately, but also, for the first time, dynamically constructs a naive Bayes classifier from marketing feedback information, thereby greatly improving the hit rate and accuracy of marketing, improving the marketing effect, greatly reducing the marketing cost, improving the marketing input-output ratio and maximizing marketing profit.
Example two
As shown in fig. 8, an embodiment of the present invention provides a target customer screening apparatus, including:
the screening module 10 is configured to screen, from the customer samples to be classified, the current target customers to whom preset information is to be pushed;
a feedback result obtaining module 20, configured to obtain the push feedback result after the preset information is pushed to the current target customers;
and the correcting module 30 is configured to correct the screening model for screening the current target customer according to the push feedback result, so as to perform next screening of the target customer according to the corrected screening model.
Specifically, the screening module 10 includes:
the model obtaining submodule is used for obtaining a screening model;
the probability calculation submodule is used for respectively calculating the probability value of each client sample in the client samples to be classified on each category according to the screening model;
and the selection submodule is used for selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample.
When the screening model is a naive bayes classifier, optionally, the model obtaining sub-module comprises:
an acquisition unit, configured to acquire training data and test data for constructing a naive Bayes classifier;
the classifier construction unit is used for constructing an initial classifier by using the training data;
and the selecting unit is used for selecting the classifier by using the test data to obtain a naive Bayesian classifier.
Specifically, the classifier building unit is specifically configured to:
obtaining the prior probability and the conditional probability of an initial classifier by using training data;
and obtaining the class of each sample in the training data and the test data according to the prior probability and the conditional probability, calculating a training error and a test error, and selecting a classifier with the minimum test error as an initial classifier.
Optionally, the specific implementation manner of obtaining the prior probability and the conditional probability of the initial classifier by using the training data is as follows:
using the formula $P(Y=c_k)=\frac{\sum_{i=1}^{N}I(y_i=c_k)}{N}$, the prior probability of the training data on each category is calculated;
using the formula $P(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N}I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N}I(y_i=c_k)}$, the conditional probability of the training data is calculated;
where the training data is $T=\{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$; $N$ is the total number of samples $x$ in the training data; $P(Y=c_k)$ is the prior probability of category $c_k$; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $P(X^{(j)}=a_{jl}\mid Y=c_k)$ is the conditional probability of the training data; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$.
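The two counting formulas above can be illustrated directly; this is a sketch with invented toy data, and `prior` and `conditional` are hypothetical helper names, not the patent's modules.

```python
# Maximum-likelihood estimates by direct counting over the training set.
def prior(labels, c_k):
    # P(Y=c_k) = sum_i I(y_i = c_k) / N
    return sum(1 for y in labels if y == c_k) / len(labels)

def conditional(samples, labels, j, a_jl, c_k):
    # P(X^(j)=a_jl | Y=c_k): feature matches counted within class c_k.
    in_class = [x for x, y in zip(samples, labels) if y == c_k]
    return sum(1 for x in in_class if x[j] == a_jl) / len(in_class)

X = [("4g", "high"), ("4g", "low"), ("2g", "low")]
y = ["buy", "buy", "skip"]
print(prior(y, "buy"))                      # 2/3
print(conditional(X, y, 0, "4g", "buy"))    # 1.0
```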
Optionally, the specific implementation manner of obtaining the class to which each sample in the training data and the test data belongs according to the prior probability and the conditional probability, calculating the training error and the test error, and selecting the classifier with the smallest test error as the initial classifier is as follows:
using the formula $y=\arg\max_{c_k}P(Y=c_k)\prod_{j=1}^{n}P(X^{(j)}=x^{(j)}\mid Y=c_k)$, the category corresponding to the maximum value obtained by each sample in the training data over the preset categories is taken as the category of that sample; wherein
$y$ is the sample class achieving the maximum value; $P(Y=c_k)$ is the prior probability of category $c_k$; $Y$ is the sample category variable; $c_k$ is a sample class value; $P(X^{(j)}=x^{(j)}\mid Y=c_k)$ is the conditional probability; $x^{(j)}$ is the $j$-th feature of $x$, where $x$ is a sample instance and $x=(x^{(1)},x^{(2)},\ldots,x^{(n)})^T$.
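The decision rule above, applied with the same direct counts, can be sketched as follows (toy data; unsmoothed, so a zero count zeroes out a class score, which is what the Bayesian estimation with parameter λ in the selection unit avoids).

```python
# argmax over classes of prior * product of per-feature conditionals.
def nb_predict(samples, labels, x):
    n = len(labels)
    best, best_score = None, -1.0
    for c in set(labels):
        idx = [i for i, y in enumerate(labels) if y == c]
        score = len(idx) / n                   # prior P(Y=c)
        for j, v in enumerate(x):
            # conditional P(X^(j)=v | Y=c) by counting within the class
            score *= sum(1 for i in idx if samples[i][j] == v) / len(idx)
        if score > best_score:
            best, best_score = c, score
    return best

X = [("4g", "high"), ("4g", "low"), ("2g", "low")]
y = ["buy", "buy", "skip"]
print(nb_predict(X, y, ("4g", "high")))  # -> buy
```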
Specifically, the selection unit is configured to:
adjusting the target-variable concentration (the proportion of target-class samples) in the training data and the Bayesian estimation parameter multiple times, calculating an evaluation index value on the test data after each adjustment, and taking the target-variable concentration and the Bayesian estimation parameter corresponding to the largest evaluation index value, to obtain the naive Bayes classifier;
wherein the prior probability of the naive Bayes classifier is: $P_\lambda(Y=c_k)=\frac{\sum_{i=1}^{N}I(y_i=c_k)+\lambda}{N+K\lambda}$;
the conditional probability is: $P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N}I(x_i^{(j)}=a_{jl},\,y_i=c_k)+\lambda}{\sum_{i=1}^{N}I(y_i=c_k)+S_j\lambda}$;
where $P_\lambda(Y=c_k)$ is the prior probability of the naive Bayes classifier; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $\lambda$ is the parameter of the Bayesian estimation; $P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)$ is the conditional probability of the naive Bayes classifier; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$.
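A sketch of the λ-smoothed (Bayesian) estimates above; λ = 1 corresponds to Laplace smoothing. The helper names and data are invented for illustration.

```python
# Bayesian-estimated probabilities with smoothing parameter lam (lambda).
def smoothed_prior(labels, c_k, K, lam=1.0):
    # P_lambda(Y=c_k) = (count(c_k) + lambda) / (N + K*lambda)
    N = len(labels)
    return (sum(1 for y in labels if y == c_k) + lam) / (N + K * lam)

def smoothed_conditional(samples, labels, j, a_jl, c_k, S_j, lam=1.0):
    # P_lambda(X^(j)=a_jl | Y=c_k); S_j = number of distinct values of feature j.
    in_class = [x for x, y in zip(samples, labels) if y == c_k]
    hits = sum(1 for x in in_class if x[j] == a_jl)
    return (hits + lam) / (len(in_class) + S_j * lam)

y = ["buy", "buy", "skip"]
X = [("4g",), ("4g",), ("2g",)]
print(smoothed_prior(y, "skip", K=2))                       # (1+1)/(3+2) = 0.4
print(smoothed_conditional(X, y, 0, "2g", "skip", S_j=2))   # (1+1)/(1+2) ≈ 0.667
```

Unlike the unsmoothed counts, a feature value never seen within a class still receives a nonzero probability, so a single zero count cannot veto a class.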
wherein the evaluation index value is calculated using the formula $F1=\frac{2PR}{P+R}$;
where $F1$ is the evaluation index value; $P$ is the precision of the initial classifier on the test data, i.e. $P$ = (number of samples correctly predicted as positive) / (number of samples predicted as positive); and $R$ is the recall of the initial classifier on the test data, i.e. $R$ = (number of samples correctly predicted as positive) / (number of samples actually positive).
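The F1 computation above in code (a minimal sketch; the counts are invented):

```python
# F1 = 2PR/(P+R), with P = TP/(TP+FP) and R = TP/(TP+FN).
def f1_score(tp, fp, fn):
    p = tp / (tp + fp)   # precision
    r = tp / (tp + fn)   # recall
    return 2 * p * r / (p + r)

print(f1_score(tp=8, fp=2, fn=2))  # P = R = 0.8, so F1 ≈ 0.8
```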
Optionally, the selecting submodule is specifically configured to:
calculating the probability values of each customer sample to be classified belonging to the different categories, selecting the samples whose target-class probability value is greater than their non-target-class probability value, arranging these in descending order of probability value, and selecting a preset number of the top-ranked customer samples as the current target customers.
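The selection rule above can be sketched as follows (customer IDs and probabilities are invented, and `select_targets` is a hypothetical name):

```python
# Keep samples where P(target) > P(non-target), rank descending, take top-N.
def select_targets(scored, preset_number):
    # scored: list of (customer_id, p_target, p_non_target)
    candidates = [(cid, p) for cid, p, q in scored if p > q]
    candidates.sort(key=lambda t: t[1], reverse=True)
    return [cid for cid, _ in candidates[:preset_number]]

scored = [("u1", 0.9, 0.1), ("u2", 0.4, 0.6), ("u3", 0.7, 0.3)]
print(select_targets(scored, 2))  # -> ['u1', 'u3']
```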
Optionally, the modification module 30 includes:
the feedback result screening submodule, configured to compare the push feedback result with the category label of the current target customers and acquire, as first samples, those current target customers whose push feedback result indicates an unsuccessful push;
and the adjusting submodule is used for adjusting the screening model according to the first sample.
Wherein, when the screening model is a naive Bayesian classifier, the adjusting submodule is specifically configured to:
according to the formula $P_1(Y=c_k)=\frac{\sum_{i=1}^{N+N_1}I(y_i=c_k)}{N+N_1}$, recalculating the prior probability of the naive Bayes classifier;
according to the formula $P_1(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N+N_1}I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N+N_1}I(y_i=c_k)}$, recalculating the conditional probability of the naive Bayes classifier;
where $P_1(Y=c_k)$ is the recalculated prior probability of the naive Bayes classifier; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $N$ is the total number of samples $x$ in the training data; $P_1(X^{(j)}=a_{jl}\mid Y=c_k)$ is the recalculated conditional probability of the naive Bayes classifier; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$; and $N_1$ is the number of first samples.
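One way to realize the recalculation above is to pool the $N$ original training labels with the $N_1$ feedback ("first") samples and recount; this is a sketch under that assumption, with invented data and a hypothetical function name.

```python
# Recount the prior over the N original plus N_1 feedback samples.
def updated_prior(labels, feedback_labels, c_k):
    merged = list(labels) + list(feedback_labels)   # N + N_1 samples
    return sum(1 for y in merged if y == c_k) / len(merged)

y = ["buy", "buy", "skip"]            # N = 3 original training labels
fb = ["skip"]                         # N_1 = 1 unsuccessful-push sample
print(updated_prior(y, fb, "skip"))   # 2/4 = 0.5
```

The conditional probabilities would be recounted over the same merged sample set, after which the adjusted model is screened with the test set as the description states.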
It should be noted that the apparatus embodiment is an apparatus corresponding to the method, and all implementations of the method embodiment are applicable to the apparatus embodiment, and the same technical effect can be achieved.
While the preferred embodiments of the present invention have been described, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the following claims.
Claims (14)
1. A method for screening target customers, comprising:
screening current target customers for pushing preset information in a customer sample to be classified;
obtaining a push feedback result after the preset information is pushed to the current target client;
and correcting the screening model for screening the current target client according to the push feedback result so as to screen the next target client according to the corrected screening model.
2. The method for screening target customers according to claim 1, wherein the step of screening the current target customers for pushing preset information in the customer samples to be classified comprises:
obtaining a screening model;
respectively calculating the probability value of each customer sample in the customer samples to be classified in each category according to the screening model;
and selecting the current target client for pushing preset information from the client samples to be classified according to the probability value of each client sample.
3. The method of claim 2, wherein when the screening model is a naive bayes classifier, the step of obtaining the screening model comprises:
acquiring training data and testing data for constructing a naive Bayes classifier;
constructing an initial classifier by using the training data;
and selecting the classifier by using the test data to obtain a naive Bayes classifier.
4. The method of claim 3, wherein the step of constructing an initial classifier using the training data comprises:
obtaining the prior probability and the conditional probability of an initial classifier by using training data;
and obtaining the class of each sample in the training data and the test data according to the prior probability and the conditional probability, calculating a training error and a test error, and selecting a classifier with the minimum test error as an initial classifier.
5. The method of claim 4, wherein the step of using the training data to obtain the prior probability and the conditional probability of the initial classifier comprises:
using the formula $P(Y=c_k)=\frac{\sum_{i=1}^{N}I(y_i=c_k)}{N}$, the prior probability of the training data on each category is calculated;
using the formula $P(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N}I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N}I(y_i=c_k)}$, the conditional probability of the training data is calculated;
where the training data is $T=\{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$; $N$ is the total number of samples $x$ in the training data; $P(Y=c_k)$ is the prior probability of category $c_k$; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $P(X^{(j)}=a_{jl}\mid Y=c_k)$ is the conditional probability of the training data; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$.
6. The method for screening target customers according to claim 4, wherein the step of obtaining the class to which each sample in the training data and the test data belongs according to the prior probability and the conditional probability, calculating the training error and the test error, and selecting the classifier with the minimum test error as the initial classifier comprises:
using the formula $y=\arg\max_{c_k}P(Y=c_k)\prod_{j=1}^{n}P(X^{(j)}=x^{(j)}\mid Y=c_k)$, the category corresponding to the maximum value obtained by each sample in the training data over the preset categories is taken as the category of that sample; wherein
$y$ is the sample class achieving the maximum value; $P(Y=c_k)$ is the prior probability of category $c_k$; $Y$ is the sample category variable; $c_k$ is a sample class value; $P(X^{(j)}=x^{(j)}\mid Y=c_k)$ is the conditional probability; $x^{(j)}$ is the $j$-th feature of $x$, where $x$ is a sample instance and $x=(x^{(1)},x^{(2)},\ldots,x^{(n)})^T$.
7. The method of claim 3, wherein the step of selecting the classifier using the test data to obtain a naive Bayesian classifier comprises:
adjusting the target-variable concentration (the proportion of target-class samples) in the training data and the Bayesian estimation parameter multiple times, calculating an evaluation index value on the test data after each adjustment, and taking the target-variable concentration and the Bayesian estimation parameter corresponding to the largest evaluation index value, to obtain the naive Bayes classifier;
wherein the prior probability of the naive Bayes classifier is $P_\lambda(Y=c_k)=\frac{\sum_{i=1}^{N}I(y_i=c_k)+\lambda}{N+K\lambda}$, and the conditional probability is $P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N}I(x_i^{(j)}=a_{jl},\,y_i=c_k)+\lambda}{\sum_{i=1}^{N}I(y_i=c_k)+S_j\lambda}$;
where $P_\lambda(Y=c_k)$ is the prior probability of the naive Bayes classifier; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $\lambda$ is the parameter of the Bayesian estimation; $P_\lambda(X^{(j)}=a_{jl}\mid Y=c_k)$ is the conditional probability of the naive Bayes classifier; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$.
8. The method of claim 7, wherein the evaluation index value is calculated using the formula $F1=\frac{2PR}{P+R}$;
where $F1$ is the evaluation index value; $P$ is the precision of the initial classifier on the test data, i.e. $P$ = (number of samples correctly predicted as positive) / (number of samples predicted as positive); and $R$ is the recall of the initial classifier on the test data, i.e. $R$ = (number of samples correctly predicted as positive) / (number of samples actually positive).
9. The method for screening target customers according to claim 2, wherein the step of selecting the target customer of the current time for pushing preset information from the customer samples to be classified according to the probability value of each customer sample comprises:
calculating the probability values of each customer sample to be classified belonging to the different categories, selecting the samples whose target-class probability value is greater than their non-target-class probability value, arranging these in descending order of probability value, and selecting a preset number of the top-ranked customer samples as the current target customers.
10. The method for screening target customers according to claim 1, wherein the step of modifying the screening model for screening the current target customer according to the push feedback result comprises:
comparing the push feedback result with the category label of the current target customers, and acquiring, as first samples, those current target customers whose push feedback result indicates an unsuccessful push;
and adjusting a screening model according to the first sample.
11. The method of claim 10, wherein when the screening model is a naive bayes classifier, the step of adjusting the screening model based on the first sample comprises:
according to the formula $P_1(Y=c_k)=\frac{\sum_{i=1}^{N+N_1}I(y_i=c_k)}{N+N_1}$, recalculating the prior probability of the naive Bayes classifier;
according to the formula $P_1(X^{(j)}=a_{jl}\mid Y=c_k)=\frac{\sum_{i=1}^{N+N_1}I(x_i^{(j)}=a_{jl},\,y_i=c_k)}{\sum_{i=1}^{N+N_1}I(y_i=c_k)}$, recalculating the conditional probability of the naive Bayes classifier;
where $P_1(Y=c_k)$ is the recalculated prior probability of the naive Bayes classifier; $y_i$ is the target variable of $Y$, representing a sample class, with $y_i\in\{c_1,c_2,\ldots,c_K\}$, where $c_k$ is a sample class value; $I(\cdot)$ is the indicator function; $N$ is the total number of samples $x$ in the training data; $P_1(X^{(j)}=a_{jl}\mid Y=c_k)$ is the recalculated conditional probability of the naive Bayes classifier; $x_i^{(j)}$ is the $j$-th feature of the $i$-th sample, and $a_{jl}$ is the $l$-th value that the $j$-th feature may take; $j=1,2,\ldots,n$, $l=1,2,\ldots,S_j$, $k=1,2,\ldots,K$; and $N_1$ is the number of first samples.
12. A target customer screening apparatus, comprising:
the screening module is configured to screen, from the customer samples to be classified, the current target customers to whom preset information is to be pushed;
the feedback result acquisition module is used for acquiring a push feedback result after the preset information is pushed to the current target client;
and the correction module is used for correcting the screening model for screening the current target client according to the push feedback result so as to screen the next target client according to the corrected screening model.
13. The targeted customer screening apparatus of claim 12, wherein the screening module comprises:
the model obtaining submodule is used for obtaining a screening model;
the probability calculation submodule is used for respectively calculating the probability value of each client sample in the client samples to be classified on each category according to the screening model;
and the selection submodule is used for selecting the current target client pushing preset information from the client samples to be classified according to the probability value of each client sample.
14. The targeted customer screening apparatus of claim 12, wherein the revision module comprises:
the feedback result screening submodule, configured to compare the push feedback result with the category label of the current target customers and acquire, as first samples, those current target customers whose push feedback result indicates an unsuccessful push;
and the adjusting submodule is used for adjusting the screening model according to the first sample.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610652832.0A CN107730286A (en) | 2016-08-10 | 2016-08-10 | A kind of target customer's screening technique and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107730286A true CN107730286A (en) | 2018-02-23 |
Family
ID=61199431
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109191210A (en) * | 2018-09-13 | 2019-01-11 | 厦门大学嘉庚学院 | A kind of broadband target user's recognition methods based on Adaboost algorithm |
CN110046770A (en) * | 2019-04-23 | 2019-07-23 | 中国科学技术大学 | Grain mildew prediction technique and device |
CN110062023A (en) * | 2019-03-12 | 2019-07-26 | 阿里巴巴集团控股有限公司 | A kind of safety education information-pushing method, device and equipment |
CN110119466A (en) * | 2019-03-29 | 2019-08-13 | 五渡(杭州)科技有限责任公司 | A kind of big data intelligent marketing system and method |
CN110457566A (en) * | 2019-08-15 | 2019-11-15 | 腾讯科技(武汉)有限公司 | Method, device, electronic equipment and storage medium |
CN110619534A (en) * | 2018-06-19 | 2019-12-27 | 华为技术有限公司 | Marketing information pushing method and device and readable storage medium |
CN111612492A (en) * | 2019-02-26 | 2020-09-01 | 北京奇虎科技有限公司 | User online accurate marketing method and device based on multi-feature fusion |
CN111768040A (en) * | 2020-07-01 | 2020-10-13 | 深圳前海微众银行股份有限公司 | Model interpretation method, device, equipment and readable storage medium |
CN111797942A (en) * | 2020-07-23 | 2020-10-20 | 深圳壹账通智能科技有限公司 | User information classification method and device, computer equipment and storage medium |
CN112581191A (en) * | 2020-08-14 | 2021-03-30 | 支付宝(杭州)信息技术有限公司 | Training method and device of behavior prediction model |
CN113065885A (en) * | 2021-03-01 | 2021-07-02 | 苏宁金融科技(南京)有限公司 | Method and system for intelligent marketing |
CN114418376A (en) * | 2022-01-14 | 2022-04-29 | 上海明胜品智人工智能科技有限公司 | Enterprise task issuing method and device, electronic equipment and storage medium |
CN114757724A (en) * | 2022-06-14 | 2022-07-15 | 湖南三湘银行股份有限公司 | Precise information pushing system and method based on genetic algorithm |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102982042A (en) * | 2011-09-07 | 2013-03-20 | 中国移动通信集团公司 | Personalization content recommendation method and platform and system |
CN103150696A (en) * | 2011-12-06 | 2013-06-12 | 中兴通讯股份有限公司 | Method and device for selecting potential customer of target value-added service |
CN104391860A (en) * | 2014-10-22 | 2015-03-04 | 安一恒通(北京)科技有限公司 | Content type detection method and device |
CN105608592A (en) * | 2014-08-21 | 2016-05-25 | 深圳市深讯信息科技发展股份有限公司 | Telecom user intelligent analyzing and pushing method and telecom user intelligent analyzing and pushing system |
Non-Patent Citations (2)
Title |
---|
唐炉亮 et al.: "Lane-Number Detection Based on Naive Bayes Classification", China Journal of Highway and Transport (《中国公路学报》) *
高影繁 et al.: "A Bayesian Text Classification Algorithm Combined with Parameter Optimization", Journal of Computer Research and Development (《计算机研究与发展》) *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107730286A (en) | A kind of target customer's screening technique and device | |
US8498950B2 (en) | System for training classifiers in multiple categories through active learning | |
US11436434B2 (en) | Machine learning techniques to identify predictive features and predictive values for each feature | |
CN103617435B (en) | Image sorting method and system for active learning | |
CN106327240A (en) | Recommendation method and recommendation system based on GRU neural network | |
CN111797320B (en) | Data processing method, device, equipment and storage medium | |
CN112633962B (en) | Service recommendation method and device, computer equipment and storage medium | |
CN110599295B (en) | Method, device and equipment for pushing articles | |
CN112149352B (en) | Prediction method for marketing activity clicking by combining GBDT automatic characteristic engineering | |
CN105550295A (en) | Classification model optimization method and classification model optimization apparatus | |
CN112669078A (en) | Behavior prediction model training method, device, equipment and storage medium | |
Zhou et al. | Longitudinal impact of preference biases on recommender systems’ performance | |
CN113407854A (en) | Application recommendation method, device and equipment and computer readable storage medium | |
CN113570398A (en) | Promotion data processing method, model training method, system and storage medium | |
CN114782123A (en) | Credit assessment method and system | |
CN113378067B (en) | Message recommendation method, device and medium based on user mining | |
CN110457387A (en) | A kind of method and relevant apparatus determining applied to user tag in network | |
CN110766086B (en) | Method and device for fusing multiple classification models based on reinforcement learning model | |
CN116912016A (en) | Bill auditing method and device | |
CN112214675B (en) | Method, device, equipment and computer storage medium for determining user purchasing machine | |
CN108038735A (en) | Data creation method and device | |
CN117217711B (en) | Automatic auditing method and system for communication fee receipt | |
CN112101776A (en) | Crowdsourcing task work group determination method | |
CN115619292B (en) | Method and device for problem management | |
CN113034167A (en) | User interest analysis method and advertisement delivery method based on user behaviors |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180223 |