CN106204083A - Target user classification method, apparatus and system - Google Patents
- Publication number
- CN106204083A (CN201510219456.1A)
- Authority
- CN
- China
- Prior art keywords
- class
- subscriber
- probability
- characteristic attribute
- targeted customer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a target user classification method, apparatus and system. The method includes: determining the probability of each user category in the training samples, and the probability of each feature attribute group under each user category, where the probability of a feature attribute group under a user category is, among the training samples of that category, the ratio of the number of training samples in which every feature attribute of the group satisfies the preset condition corresponding to the group to the number of training samples of that category, the feature attribute groups being mutually independent; using the Bayesian formula to determine, from the determined probability of each user category and the conditional probability estimate of each feature attribute group under each user category, the posterior probability of the target user to be classified in each category; and determining the category with the largest posterior probability as the user category of the target user to be classified. The scheme of the embodiments improves the accuracy of target user classification.
Description
Technical field
The present invention relates to the field of information technology, and in particular to a target user classification method, apparatus and system.
Background
Data mining has been widely applied in recent years, and classification is one of its main components. As the related algorithms have matured, classification algorithms have been applied in many fields. When service industries such as banks, telecom operators, and supermarkets promote new products or related campaigns, they can target their publicity at different users. Accurately identified target users are the basis of precision marketing: only after the target users of a given category have been identified within the consumer population can effective, targeted marketing be carried out. How to classify target users effectively has therefore become a focus of attention across industries.
Existing methods for classifying target users mainly use decision trees and Bayesian methods. Bayesian methods, which combine acyclic graphs with probability theory, rest on a solid probabilistic foundation and are widely used. For any user data, feature attributes characterize information about the user. Taking mobile phone users as an example, the user's sex, age, time on network, monthly data traffic, data plan value, number of calls, call charges, and so on are all feature attributes. When products or services are pushed to users, they can be pushed according to user category; for example, users older than 30 can be treated as a first target user category and users aged 30 or younger as a second target user category. To classify a target user, the probability with which each category occurs in the data samples is determined first, together with the conditional probability estimate of each feature attribute for each category, i.e. the prior probability. The Bayesian algorithm then uses these conditional probability estimates to compute the posterior probability of the target user to be classified in each category, and the category with the maximum posterior probability is taken as the category of that user.
The Bayesian method described above requires the assumption that the feature attributes are mutually independent. In practice, however, the feature attributes of user data are correlated to some degree, so this independence assumption makes target user classification inaccurate.
Summary of the invention
Embodiments of the present invention provide a target user classification method, apparatus and system to solve the problem in the prior art that target user classification accuracy is low.
An embodiment of the present invention provides a target user classification method, including:
Determining the probability of each user category in the training samples, and the conditional probability estimate of each feature attribute group under each user category, where the probability of a user category is the ratio of the number of training samples in that category to the total number of training samples, and the conditional probability estimate of a feature attribute group under a user category is, among the training samples of that category, the ratio of the number of training samples in which every feature attribute of the group satisfies the preset condition corresponding to the group to the number of training samples of that category; each feature attribute group consists of correlated feature attributes extracted from all the feature attributes of the training samples, the feature attribute groups are mutually independent, and the feature attributes characterize the training sample data;
Using the Bayesian formula to determine, from the determined probability of each user category and the conditional probability estimate of each feature attribute group under each user category, the posterior probability of the target user to be classified in each category;
Determining the category with the largest posterior probability as the user category of the target user to be classified.
In the method provided by the embodiment of the present invention, correlated feature attributes are combined into feature attribute groups, and the groups are mutually independent. This satisfies the Bayesian method's assumption that its parameters are mutually independent, and therefore improves the accuracy of target user classification.
Further, the posterior probability of the target user to be classified in each category is determined using the following formula:
[Formula not reproduced in this text.]
Here, C_i is the i-th user category, 1 ≤ i ≤ m, and m is the total number of user categories; P(X_kj | C_i) is the conditional probability estimate of the k-th feature attribute group under user category C_i when the feature attributes of the k-th group satisfy preset condition j; n is the number of feature attribute groups; r is the number of preset conditions; P(C_i) is the probability that user category C_i occurs; and P(X | C_i) is the posterior probability of the target user X to be classified in user category C_i.
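A form consistent with the symbol definitions above, offered as an assumption rather than as the patent's exact expression, is the naive Bayes product taken over feature attribute groups instead of individual attributes:

```latex
P(X \mid C_i) \;=\; P(C_i)\,\prod_{k=1}^{n} P(X_{kj} \mid C_i),
\qquad 1 \le i \le m,\quad 1 \le j \le r
```

In this reading, j is the preset condition that the user's attributes in group k satisfy, so j may differ from group to group. The product form relies on the stated mutual independence of the feature attribute groups.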
Further, the method also includes:
Before the category with the largest posterior probability is determined as the category of the target user to be classified, comparing the determined maximum posterior probability with a preset risk control coefficient, and determining that the maximum posterior probability is greater than the preset risk control coefficient.
Further, the method also includes:
When the maximum posterior probability is determined to be not greater than the preset risk control coefficient, abandoning the category determination for the target user to be classified.
In this way, target users to be classified whose maximum posterior probability is not greater than the preset risk control coefficient are discarded, which reduces marketing risk and can improve the marketing success rate.
An embodiment of the present invention further provides a target user classification apparatus, including:
A first determining unit, configured to determine the probability of each user category in the training samples, and the conditional probability estimate of each feature attribute group under each user category, where the probability of a user category is the ratio of the number of training samples in that category to the total number of training samples, and the conditional probability estimate of a feature attribute group under a user category is, among the training samples of that category, the ratio of the number of training samples in which every feature attribute of the group satisfies the preset condition corresponding to the group to the number of training samples of that category; each feature attribute group consists of correlated feature attributes extracted from all the feature attributes of the training samples, the feature attribute groups are mutually independent, and the feature attributes characterize the training sample data;
A second determining unit, configured to use the Bayesian formula to determine, from the determined probability of each user category and the conditional probability estimate of each feature attribute group under each user category, the posterior probability of the target user to be classified in each category;
A third determining unit, configured to determine the category with the largest posterior probability as the user category of the target user to be classified.
In the apparatus provided by the embodiment of the present invention, correlated feature attributes are combined into feature attribute groups, and the groups are mutually independent. This satisfies the Bayesian method's assumption that its parameters are mutually independent, and therefore improves the accuracy of target user classification.
Further, the second determining unit is specifically configured to determine the posterior probability of the target user to be classified in each category using the following formula:
[Formula not reproduced in this text.]
Here, C_i is the i-th user category, 1 ≤ i ≤ m, and m is the total number of user categories; P(X_kj | C_i) is the conditional probability estimate of the k-th feature attribute group under user category C_i when the feature attributes of the k-th group satisfy preset condition j; n is the number of feature attribute groups; r is the number of preset conditions; P(C_i) is the probability that user category C_i occurs; and P(X | C_i) is the posterior probability of the target user X to be classified in user category C_i.
Further, the apparatus also includes:
A comparing unit, configured to compare, before the category with the largest posterior probability is determined as the category of the target user to be classified, the determined maximum posterior probability with a preset risk control coefficient, and to determine that the maximum posterior probability is greater than the preset risk control coefficient.
Further, the apparatus also includes:
A discarding unit, configured to abandon the category determination for the target user to be classified when the maximum posterior probability is determined to be not greater than the preset risk control coefficient.
In this way, target users to be classified whose maximum posterior probability is not greater than the preset risk control coefficient are discarded, which reduces marketing risk and can improve the marketing success rate.
An embodiment of the present invention further provides a target user classification system, including:
The target user classification apparatus provided by the above embodiment.
Other features and advantages of the application will be set forth in the following description and will in part become apparent from the description, or be understood by implementing the application. The objectives and other advantages of the application can be realized and obtained through the structure particularly pointed out in the written description, the claims, and the accompanying drawings.
Brief description of the drawings
The accompanying drawings provide a further understanding of the invention and constitute part of the description. Together with the embodiments of the invention they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is a flow chart of the target user classification method provided by an embodiment of the present invention;
Fig. 2 is a flow chart of the target user classification method provided by Embodiment 1 of the present invention;
Fig. 3 is a structural diagram of the target user classification apparatus provided by Embodiment 2 of the present invention.
Detailed description
To provide an implementation that improves target user classification accuracy, embodiments of the present invention provide a target user classification method, apparatus and system. Preferred embodiments are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the invention and do not limit it. Where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.
An embodiment of the present invention provides a target user classification method which, as shown in Fig. 1, includes:
Step 101: determine the probability of each user category in the training samples, and the probability of each feature attribute group under each user category. The probability of a user category is the ratio of the number of training samples in that category to the total number of training samples. The probability of a feature attribute group under a user category is, among the training samples of that category, the ratio of the number of training samples in which every feature attribute of the group satisfies the preset condition corresponding to the group to the number of training samples of that category. Each feature attribute group consists of correlated feature attributes extracted from all the feature attributes of the training samples, the feature attribute groups are mutually independent, and the feature attributes characterize the training sample data.
Step 102: using the Bayesian formula, determine, from the determined probability of each user category and the probability of each feature attribute group under each user category, the posterior probability of the target user to be classified in each category.
Step 103: determine the category with the largest posterior probability as the user category of the target user to be classified.
The target user classification method of the embodiment can be applied to the precision marketing services of merchants or enterprises. For a given marketing service, the training samples can be the basic data of users who previously used the service, obtained by random sampling. Under that marketing service, one user's data is one instance. The feature attributes characterize the training samples. Taking a mobile service as an example, the training samples include the various data of users who previously used the mobile service, and the feature attributes may include the user's sex, age, time on network, monthly data traffic, data plan value, number of calls, call charges, and so on.
From all the feature attributes in the training samples of a marketing service, the correlated feature attributes are extracted and combined into feature attribute groups, and the groups are mutually independent. Specifically, the number of feature attribute groups can be set flexibly for different marketing services.
A user category is a preset type used for pushing a specific product to users.
The method, apparatus, and corresponding system provided by the invention are described in detail below through specific embodiments and with reference to the accompanying drawings.
Embodiment 1:
Fig. 2 is a flow chart of the target user classification method provided by Embodiment 1 of the present invention, which specifically includes the following processing steps:
Step 201: construct the feature attribute groups.
In this embodiment, for a given marketing service, the basic data of each user who previously used the service serves as raw sample data, the basic data of one user being one raw sample. A preset number of raw samples is drawn at random as the training samples. The raw sample data include various feature attributes; taking the data characteristics of the marketing service into account, the correlated feature attributes are selected from all the feature attributes to form feature attribute groups. Taking a mobile service as an example, the correlated feature attributes can be divided into several groups: a traffic group (traffic ARPU (Average Revenue Per User), monthly data traffic, over-plan data traffic, data plan value), a terminal group (terminal network standard, device age), a call group (number of calls, call charges), and a customer charge group (the user's monthly charges). In this approach, all correlated feature attributes are placed into feature attribute groups. Alternatively, only some of the correlated feature attributes may be selected to form the groups, for example a traffic group (monthly data traffic, traffic ARPU), a terminal group (device age), a call group (number of calls, call charges), and a customer charge group (the user's monthly charges).
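The grouping described above can be pictured as a mapping from group name to member attributes. The sketch below is illustrative only: the field names (`traffic_arpu`, `monthly_traffic`, and so on) and the helper `group_values` are hypothetical, and the text does not prescribe how correlated attributes are detected.

```python
# Hypothetical feature attribute groups for the mobile-service example.
# The group names follow the text; the attribute keys are made up.
FEATURE_GROUPS = {
    "traffic": ["traffic_arpu", "monthly_traffic"],
    "terminal": ["device_age"],
    "call": ["call_count", "call_charges"],
    "customer_charge": ["monthly_fee"],
}

def group_values(user, groups=FEATURE_GROUPS):
    """Project one user record onto each group, giving one tuple per group."""
    return {name: tuple(user[attr] for attr in attrs)
            for name, attrs in groups.items()}

# One illustrative user record (all values invented).
user = {"traffic_arpu": 12.0, "monthly_traffic": 8.5, "device_age": 2,
        "call_count": 40, "call_charges": 23.0, "monthly_fee": 58.0}
grouped = group_values(user)
print(grouped["traffic"])  # (12.0, 8.5)
```

Treating each group as a single composite variable is what lets the later steps estimate one probability per group rather than one per attribute.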
Assume the s feature attributes are A_1, A_2, ..., A_s; the number of user categories is m, the categories being C_1, C_2, ..., C_m; the feature attribute values in a training sample are (X_1, X_2, ..., X_s); and n feature attribute groups are constructed, B_1 = (A_1, A_2, A_3), B_2 = (A_4, A_6), B_3 = (A_5), ..., B_n. As a concrete example, assume the training samples fall into 2 user categories, C_1 being 4G plan users and C_2 being non-4G plan users. The training samples cover 50,000 users, of whom 5,000 are 4G plan users and 45,000 are non-4G plan users.
Step 202: determine the probability of each user category in the training samples.
In this step, the probability that user category C_1 occurs is P(C_1) = 5000/50000 = 0.1, and the probability that user category C_2 occurs is P(C_2) = 45000/50000 = 0.9.
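The category probabilities in step 202 are plain frequency counts. A minimal sketch, assuming the training labels are available as a list (the label strings and the function name `class_priors` are illustrative):

```python
from collections import Counter

def class_priors(labels):
    """P(C_i) = (training samples in category C_i) / (all training samples)."""
    counts = Counter(labels)
    total = len(labels)
    return {category: n / total for category, n in counts.items()}

# The worked example: 5,000 4G plan users and 45,000 non-4G plan users.
labels = ["4G"] * 5000 + ["non-4G"] * 45000
priors = class_priors(labels)
print(priors)  # {'4G': 0.1, 'non-4G': 0.9}
```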
Step 203: determine the conditional probability estimate of each feature attribute group under each user category.
In this step, the conditional probability estimate of a feature attribute group under a user category is, among the training samples of that category, the ratio of the number of training samples in which every feature attribute of the group satisfies the preset condition corresponding to the group to the number of training samples of that category. A feature attribute group may have multiple preset conditions.
For example, the k-th feature attribute group includes 2 feature attributes, monthly data traffic A_1 and traffic ARPU A_2, and the group has 4 preset conditions: (1) A_1 ≤ 10, A_2 ≤ 10; (2) A_1 ≤ 10, A_2 > 10; (3) A_1 > 10, A_2 ≤ 10; (4) A_1 > 10, A_2 > 10. In the training sample data of 4G plan users, the numbers of users satisfying the four preset conditions are 500, 2500, 1000, and 1000 respectively. When the feature attributes of the k-th group satisfy the first preset condition, the conditional probability estimate of the k-th group under user category C_1 is P(X_k1 | C_1) = 500/5000 = 0.1. When they satisfy the second preset condition, the estimate is P(X_k2 | C_1) = 2500/5000 = 0.5. When they satisfy the third preset condition, the estimate is P(X_k3 | C_1) = 1000/5000 = 0.2. When they satisfy the fourth preset condition, the estimate is P(X_k4 | C_1) = 1000/5000 = 0.2. Similarly, the conditional probability estimates of the k-th group under user category C_2 can be determined for each of the four preset conditions.
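The arithmetic in this example can be reproduced with a short sketch. The threshold of 10 and the four counts come from the text; the function names are hypothetical, and the code assumes the four preset conditions partition the category's samples so that the counts sum to the category size.

```python
def condition_index(a1, a2, threshold=10):
    """Map (monthly traffic A_1, traffic ARPU A_2) to preset condition 1..4."""
    if a1 <= threshold:
        return 1 if a2 <= threshold else 2
    return 3 if a2 <= threshold else 4

def conditional_estimates(condition_counts):
    """P(X_kj | C_i) = samples meeting condition j / samples in category C_i.

    Assumes the conditions are exhaustive and mutually exclusive, so the
    category size equals the sum of the per-condition counts.
    """
    total = sum(condition_counts.values())
    return {j: n / total for j, n in condition_counts.items()}

# Counts of 4G plan users meeting each of the four conditions (from the text).
counts_c1 = {1: 500, 2: 2500, 3: 1000, 4: 1000}
estimates = conditional_estimates(counts_c1)
print(estimates)  # {1: 0.1, 2: 0.5, 3: 0.2, 4: 0.2}
```

A user with monthly traffic 5 and traffic ARPU 20 falls under condition 2, so `condition_index(5, 20)` selects the estimate 0.5 for category C_1.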
In the same way, the conditional probability estimate of every other feature attribute group under each user category can be determined for each preset condition corresponding to that group. The conditional probability estimates so determined are the prior probabilities of the feature attribute groups with respect to each user category. This is equivalent to training on the training sample data via steps 201 to 203 to generate a classifier.
Step 204: using the Bayesian formula, determine, from the determined probability of each user category and the conditional probability estimates, the posterior probability of the target user to be classified in each category.
In this step, the posterior probability of the target user to be classified in each category is determined using the following formula:
[Formula not reproduced in this text.]
Here, C_i is the i-th user category, 1 ≤ i ≤ m, and m is the total number of user categories; P(X_kj | C_i) is the conditional probability estimate of the k-th feature attribute group under user category C_i when the feature attributes of the k-th group satisfy preset condition j; n is the number of feature attribute groups; r is the number of preset conditions; P(C_i) is the probability that user category C_i occurs; and P(X | C_i) is the posterior probability of the target user X to be classified in user category C_i.
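Step 204 can be sketched in a few lines, assuming the usual naive Bayes product form applied to groups: the category probability multiplied by one conditional estimate per feature attribute group. That product form is an assumption inferred from the symbol definitions, and apart from the 4G traffic-group estimates every number below is invented for illustration.

```python
import math

def posterior_scores(priors, cond_probs, observed):
    """Score each category as P(C_i) times the product, over groups k, of
    P(X_kj | C_i), where j = observed[k] is the condition the user meets."""
    scores = {}
    for category, prior in priors.items():
        likelihood = math.prod(
            cond_probs[category][group][j] for group, j in observed.items()
        )
        scores[category] = prior * likelihood
    return scores

priors = {"4G": 0.1, "non-4G": 0.9}
# Only the 4G "traffic" estimates come from the worked example;
# all other values are hypothetical.
cond_probs = {
    "4G": {"traffic": {1: 0.1, 2: 0.5, 3: 0.2, 4: 0.2},
           "call": {1: 0.3, 2: 0.7}},
    "non-4G": {"traffic": {1: 0.6, 2: 0.1, 3: 0.2, 4: 0.1},
               "call": {1: 0.8, 2: 0.2}},
}
observed = {"traffic": 2, "call": 2}  # preset condition met in each group
scores = posterior_scores(priors, cond_probs, observed)
best = max(scores, key=scores.get)
print(best)  # 4G
```

With these numbers the 4G score is 0.1 × 0.5 × 0.7 and the non-4G score is 0.9 × 0.1 × 0.2, so the user is assigned to the 4G category despite its much smaller category probability.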
Step 205: determine whether the maximum posterior probability is greater than the preset risk control coefficient; if so, go to step 206; if not, go to step 207.
The preset risk control coefficient can be set flexibly according to the actual situation.
Step 206: determine the category with the largest posterior probability as the category of the target user to be classified.
Step 207: abandon the category determination for the target user to be classified.
In the embodiment of the present invention, marketing requires pushing to target users of each category the service corresponding to that category. Even after the category with the largest posterior probability has been determined, the target user to be classified may not want the corresponding service pushed to them. The preset risk control coefficient is therefore used to judge the degree of risk attached to the category. If the maximum posterior probability is not greater than the risk control coefficient, the classification of the target user is considered risky and inaccurate, the category determination for that user is abandoned, and no service is subsequently pushed to that user.
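Steps 205 to 207 amount to a thresholded arg-max. The sketch below is a minimal illustration with a hypothetical function name and made-up scores; the text does not say whether the posterior is normalised before the comparison, so the code simply compares whatever values it is given.

```python
def classify_with_risk_control(scores, risk_coefficient):
    """Return the category with the largest posterior, or None when that
    posterior is not greater than the preset risk control coefficient."""
    best = max(scores, key=scores.get)
    if scores[best] > risk_coefficient:
        return best   # step 206: accept the category
    return None       # step 207: abandon the determination, push nothing

# Illustrative posteriors for one user (values invented).
scores = {"4G": 0.66, "non-4G": 0.34}
print(classify_with_risk_control(scores, 0.6))  # 4G
print(classify_with_risk_control(scores, 0.7))  # None
```

A posterior exactly equal to the coefficient is also rejected, matching the "not greater than" wording in the text.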
In the method provided by Embodiment 1 of the present invention, correlated feature attributes are combined into feature attribute groups, and the groups are mutually independent. This satisfies the Bayesian method's assumption that its parameters are mutually independent, and therefore improves the accuracy of target user classification. In addition, target users to be classified whose maximum posterior probability is not greater than the preset risk control coefficient are discarded, which reduces marketing risk and can improve the marketing success rate.
Embodiment 2:
Based on the same inventive concept as the target user classification method provided by the above embodiment, Embodiment 2 of the present invention correspondingly provides a target user classification apparatus. Its structure is shown in Fig. 3, and it specifically includes:
A first determining unit 301, configured to determine the probability of each user category in the training samples, and the conditional probability estimate of each feature attribute group under each user category, where the probability of a user category is the ratio of the number of training samples in that category to the total number of training samples, and the conditional probability estimate of a feature attribute group under a user category is, among the training samples of that category, the ratio of the number of training samples in which every feature attribute of the group satisfies the preset condition corresponding to the group to the number of training samples of that category; each feature attribute group consists of correlated feature attributes extracted from all the feature attributes of the training samples, the feature attribute groups are mutually independent, and the feature attributes characterize the training sample data;
A second determining unit 302, configured to use the Bayesian formula to determine, from the determined probability of each user category and the conditional probability estimate of each feature attribute group under each user category, the posterior probability of the target user to be classified in each category;
A third determining unit 303, configured to determine the category with the largest posterior probability as the user category of the target user to be classified.
Further, the conditional probability estimate under each user category, for the case where the feature attributes of a feature attribute group satisfy the preset condition corresponding to the group, is, among the training sample data of that user category, the ratio of the number of training samples in which every feature attribute of the group satisfies the preset condition corresponding to the group to the number of training samples of that user category.
Further, the second determining unit 302 is specifically configured to determine the posterior probability of the target user to be classified in each category using the following formula:
[Formula not reproduced in this text.]
Here, C_i is the i-th user category, 1 ≤ i ≤ m, and m is the total number of user categories; P(X_kj | C_i) is the conditional probability estimate of the k-th feature attribute group under user category C_i when the feature attributes of the k-th group satisfy preset condition j; n is the number of feature attribute groups; r is the number of preset conditions; P(C_i) is the probability that user category C_i occurs; and P(X | C_i) is the posterior probability of the target user X to be classified in user category C_i.
Further, the apparatus also includes:
A comparing unit 304, configured to compare, before the category with the largest posterior probability is determined as the category of the target user to be classified, the determined maximum posterior probability with a preset risk control coefficient, and to determine that the maximum posterior probability is greater than the preset risk control coefficient.
Further, the apparatus also includes:
A discarding unit 305, configured to abandon the category determination for the target user to be classified when the maximum posterior probability is determined to be not greater than the preset risk control coefficient.
Embodiment 2 of the present invention further provides a target user classification system, including:
The target user classification apparatus provided by Embodiment 2 of the present invention.
The functions of the above units correspond to the respective processing steps in the flows shown in Fig. 1 or Fig. 2 and are not repeated here.
In summary, the solution provided by the embodiments of the present invention comprises: determining the probability of each user class in the training samples and the probability of each feature attribute group under each user class, where the probability of a user class is the ratio of the number of training samples under that user class to the total number of training samples, and the probability of a feature attribute group under a user class is the ratio of the number of training samples under that user class in which every feature attribute in the group satisfies the preset condition corresponding to the group to the number of training samples of that user class; each feature attribute group comprises correlated feature attributes extracted from all feature attributes of the training samples, the feature attribute groups are mutually independent, and a feature attribute characterizes a feature of the training sample data; using the Bayesian formula to determine, from the determined probability of each user class and the conditional probability estimate of each feature attribute group under each user class, the posterior probability of the target user to be classified in each class; and determining the class corresponding to the maximum posterior probability as the user class of the target user to be classified. The solution of the embodiments of the present invention improves the accuracy of target user classification.
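The procedure summarized above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: attributes are taken to be discrete, the "preset condition" of a group is modeled as the specific joint combination of values taken by the attributes in that group, and the names `train`, `posterior_scores`, and `classify` are hypothetical. No smoothing is applied, so a combination never observed under a class gives that class a score of zero.

```python
from collections import Counter, defaultdict


def train(samples, labels, groups):
    """Estimate class probabilities and per-group conditional probabilities.

    samples: list of dicts mapping attribute name -> discrete value
    labels:  list of user classes, one per sample
    groups:  list of tuples of attribute names (groups assumed independent)
    """
    total = len(samples)
    class_counts = Counter(labels)
    # Probability of a class = samples in the class / all samples.
    priors = {c: n / total for c, n in class_counts.items()}

    # joint_counts[c][k][values] counts class-c samples whose attributes in
    # group k jointly take `values` (the assumed form of a preset condition).
    joint_counts = defaultdict(lambda: defaultdict(Counter))
    for sample, c in zip(samples, labels):
        for k, group in enumerate(groups):
            values = tuple(sample[a] for a in group)
            joint_counts[c][k][values] += 1

    # Conditional estimate = matching samples in the class / samples in the class.
    cond = {
        c: {k: {v: n / class_counts[c] for v, n in counter.items()}
            for k, counter in per_group.items()}
        for c, per_group in joint_counts.items()
    }
    return priors, cond


def posterior_scores(user, priors, cond, groups):
    """Score each class as P(class) times the product of group conditionals."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for k, group in enumerate(groups):
            values = tuple(user[a] for a in group)
            score *= cond[c][k].get(values, 0.0)  # unseen combination -> 0
        scores[c] = score
    return scores


def classify(user, priors, cond, groups):
    """Return the class with the largest score."""
    scores = posterior_scores(user, priors, cond, groups)
    return max(scores, key=scores.get)


# Illustrative data only; attribute names and values are made up.
samples = [
    {"age": "young", "plan": "data", "calls": "low"},
    {"age": "young", "plan": "data", "calls": "low"},
    {"age": "old", "plan": "voice", "calls": "high"},
    {"age": "old", "plan": "voice", "calls": "low"},
]
labels = ["A", "A", "B", "B"]
groups = [("age", "plan"), ("calls",)]  # correlated attributes share a group

priors, cond = train(samples, labels, groups)
user = {"age": "old", "plan": "voice", "calls": "low"}
print(classify(user, priors, cond, groups))  # B
```

Treating each group as a single joint variable is what lets correlated attributes stay together while the independence assumption is applied only between groups.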
The target user classification device provided by the embodiments of the present application may be implemented by a computer program. Those skilled in the art should understand that the module division described above is only one of many possible module divisions; a device divided into other modules, or not divided into modules at all, falls within the protection scope of the present application as long as the target user classification device has the functions described above.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to cover them as well.
Claims (9)
1. A target user classification method, characterized by comprising:
determining the probability of each user class in training samples and the conditional probability estimate of each feature attribute group under each user class, wherein the probability of a user class is the ratio of the number of training samples under that user class to the total number of training samples, and the conditional probability estimate of a feature attribute group under a user class is the ratio of the number of training samples under that user class in which every feature attribute in the group satisfies the preset condition corresponding to the group to the number of training samples of that user class; each feature attribute group comprises correlated feature attributes extracted from all feature attributes of the training samples, the feature attribute groups are mutually independent, and a feature attribute characterizes a feature of the training sample data;
using the Bayesian formula to determine, from the determined probability of each user class and the conditional probability estimate of each feature attribute group under each user class, the posterior probability of a target user to be classified in each class; and
determining the class corresponding to the maximum posterior probability as the user class of the target user to be classified.
2. The method according to claim 1, characterized in that the posterior probability of the target user to be classified in each class is determined using the following formula:
where $C_i$ is the $i$-th user class, $1 \le i \le m$, and $m$ is the total number of user classes; $P(X_{kj} \mid C_i)$ denotes the conditional probability estimate of the $k$-th feature attribute group under user class $C_i$ when each feature attribute of the $k$-th feature attribute group satisfies preset condition $j$; $n$ is the number of feature attribute groups; $r$ is the number of preset conditions; $P(C_i)$ denotes the probability that user class $C_i$ occurs; and $P(X \mid C_i)$ denotes the posterior probability of the target user $X$ to be classified in user class $C_i$.
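The symbol definitions above are consistent with a naive-Bayes-style product over the mutually independent feature attribute groups. One form that matches those definitions is sketched here as an assumption; it is not a reproduction of the claimed formula:

```latex
P(X \mid C_i) = P(C_i) \prod_{k=1}^{n} P(X_{kj} \mid C_i),
\qquad 1 \le i \le m, \quad 1 \le j \le r
```

Here $j$ is taken to be the preset condition that the target user's attributes in group $k$ actually satisfy. In conventional notation the left-hand side would be written as a quantity proportional to $P(C_i \mid X)$; the claim's own symbol is kept for consistency with its definitions.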
3. The method according to claim 1, characterized in that, before the class corresponding to the maximum posterior probability is determined as the class of the target user to be classified, the method further comprises:
comparing the determined maximum posterior probability with a preset risk control coefficient, and determining that the maximum posterior probability is greater than the preset risk control coefficient.
4. The method according to claim 3, characterized by further comprising:
when it is determined that the maximum posterior probability is not greater than the preset risk control coefficient, abandoning the classification judgment for the target user to be classified.
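The risk control check in claims 3 and 4 can be sketched as a rejection rule applied to the per-class posterior probabilities. The function name `decide`, the use of `None` to signal an abandoned judgment, and the example coefficient value are assumptions made for illustration.

```python
def decide(posteriors, risk_coefficient):
    """Return the class with the maximum posterior, or None if rejected.

    posteriors:       dict mapping user class -> posterior probability
    risk_coefficient: preset threshold the maximum posterior must exceed
    """
    best_class = max(posteriors, key=posteriors.get)
    # Accept only when the maximum posterior is strictly greater than the
    # coefficient; "not greater than" means the judgment is abandoned.
    if posteriors[best_class] > risk_coefficient:
        return best_class
    return None


print(decide({"A": 0.7, "B": 0.3}, 0.6))    # A
print(decide({"A": 0.55, "B": 0.45}, 0.6))  # None
```

Because the comparison is strict, a maximum posterior exactly equal to the coefficient is also rejected, which matches the "not greater than" wording of claim 4.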
5. A target user classification device, characterized by comprising:
a first determining unit, configured to determine the probability of each user class in training samples and the conditional probability estimate of each feature attribute group under each user class, wherein the probability of a user class is the ratio of the number of training samples under that user class to the total number of training samples, and the conditional probability estimate of a feature attribute group under a user class is the ratio of the number of training samples under that user class in which every feature attribute in the group satisfies the preset condition corresponding to the group to the number of training samples of that user class; each feature attribute group comprises correlated feature attributes extracted from all feature attributes of the training samples, the feature attribute groups are mutually independent, and a feature attribute characterizes a feature of the training sample data;
a second determining unit, configured to use the Bayesian formula to determine, from the determined probability of each user class and the conditional probability estimate of each feature attribute group under each user class, the posterior probability of a target user to be classified in each class; and
a third determining unit, configured to determine the class corresponding to the maximum posterior probability as the user class of the target user to be classified.
6. The device according to claim 5, characterized in that the second determining unit is specifically configured to determine the posterior probability of the target user to be classified in each class using the following formula:
where $C_i$ is the $i$-th user class, $1 \le i \le m$, and $m$ is the total number of user classes; $P(X_{kj} \mid C_i)$ denotes the conditional probability estimate of the $k$-th feature attribute group under user class $C_i$ when each feature attribute of the $k$-th feature attribute group satisfies preset condition $j$; $n$ is the number of feature attribute groups; $r$ is the number of preset conditions; $P(C_i)$ denotes the probability that user class $C_i$ occurs; and $P(X \mid C_i)$ denotes the posterior probability of the target user $X$ to be classified in user class $C_i$.
7. The device according to claim 5, characterized by further comprising:
a comparing unit, configured to, before the class corresponding to the maximum posterior probability is determined as the class of the target user to be classified, compare the determined maximum posterior probability with a preset risk control coefficient and determine that the maximum posterior probability is greater than the preset risk control coefficient.
8. The device according to claim 7, characterized by further comprising:
an abandoning unit, configured to abandon the classification judgment for the target user to be classified when it is determined that the maximum posterior probability is not greater than the preset risk control coefficient.
9. A target user classification system, characterized by comprising:
the device according to any one of claims 5 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510219456.1A CN106204083B (en) | 2015-04-30 | 2015-04-30 | Target user classification method, device and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510219456.1A CN106204083B (en) | 2015-04-30 | 2015-04-30 | Target user classification method, device and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106204083A true CN106204083A (en) | 2016-12-07 |
CN106204083B CN106204083B (en) | 2020-02-18 |
Family
ID=57458538
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510219456.1A Active CN106204083B (en) | 2015-04-30 | 2015-04-30 | Target user classification method, device and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106204083B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN106204083B (en) | 2020-02-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |