CN105447050B - Processing method and apparatus for client segmentation

Processing method and apparatus for client segmentation

Info

Publication number
CN105447050B
Authority
CN
China
Prior art keywords
discriminator
client
classification
uncertainty value
symmetrical
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410468888.1A
Other languages
Chinese (zh)
Other versions
CN105447050A (en)
Inventor
刘恒
刘超凡
韩志坚
曾泽欢
周刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Application filed by Shenzhen TCL New Technology Co Ltd
Priority to CN201410468888.1A
Publication of CN105447050A
Application granted
Publication of CN105447050B


Abstract

The invention discloses a processing method for client segmentation, comprising: establishing an actual client model and determining the client categories; establishing discriminators for judging client characteristic attributes; calculating the frequency of each client category, the frequency of each characteristic attribute under each client category, and the frequency of the other discriminators conditioned on each discriminator; calculating the symmetrical uncertainty value between every two discriminators and between each discriminator and the client category; searching for the discriminators to be deleted and deleting them; and collecting client data and performing naive Bayes classification on the client using the retained discriminators. The invention also discloses a processing apparatus for client segmentation. The method and apparatus reduce the correlation and redundancy among discriminators, so that the retained discriminators are strongly independent of one another. Because fewer discriminators are used, the amount and complexity of computation in client segmentation decrease; because the retained discriminators are strongly independent, the accuracy of client segmentation is higher.

Description

Processing method and apparatus for client segmentation
Technical field
The present invention relates to the technical field of classification algorithms, and in particular to a processing method and apparatus for client segmentation.
Background art
The following situation is often encountered in product sales: based on the result of communicating with a client, it must be judged whether the client is a potential customer, and if so, sales techniques are used to raise the likelihood of closing a deal and thereby increase sales volume. This judgment is a client segmentation process.
Events involving client segmentation arise to some degree in many technical fields. The common prior-art approach is to first establish a client model and then classify client types with a naive Bayes classification algorithm. However, the existing naive Bayes classification algorithm is affected by irrelevance and redundancy among discriminators, so the algorithm has high complexity and client segmentation easily becomes inaccurate.
Summary of the invention
The main object of the present invention is to solve the technical problem that the traditional naive Bayes classification algorithm has high complexity and classifies clients with insufficient accuracy.
To achieve the above object, the present invention provides a processing method for client segmentation, comprising:
Step S1: establish an actual client model and determine the client categories C in the actual client model;
Step S2: establish discriminators for judging client characteristic attributes;
Step S3: calculate the frequency of each client category C in the actual client model, the frequency of each characteristic attribute under each client category C, and the frequency of the other discriminators conditioned on each discriminator;
Step S4: from the data calculated in step S3, calculate the symmetrical uncertainty value between every two discriminators and between each discriminator and the client category C;
Step S5: search for the discriminators to be deleted according to the result of step S4, delete them, and retain the discriminators that are strongly independent of one another;
Step S6: collect client data, and perform naive Bayes classification on the client according to the client data using the discriminators retained in step S5.
Preferably, searching for the discriminators to be deleted according to the calculation result and deleting them specifically comprises:
Step S51: find the discriminator Ap whose symmetrical uncertainty value with the client category C is the largest;
Step S52: delete the discriminators whose symmetrical uncertainty value with other discriminators is greater than the maximum symmetrical uncertainty value SU(Ap, C), and retain the other discriminators.
Preferably, step S52 is specifically: repeatedly perform the following operation: when the symmetrical uncertainty value SU(Ai, Aq) between a discriminator Aq different from Ap and at least one other discriminator is greater than or equal to the maximum symmetrical uncertainty value SU(Ap, C), delete the discriminator Aq; repeat until the symmetrical uncertainty values between all remaining discriminators are less than SU(Ap, C).
Preferably, before step S51 the method further comprises:
Step S511: compare the symmetrical uncertainty value between each discriminator and the client category with a preset symmetrical uncertainty threshold;
Step S512: delete the discriminators whose symmetrical uncertainty value with the client category is greater than or equal to the symmetrical uncertainty threshold.
Preferably, at least three discriminators are established in step S2, and each discriminator comprises at least two client characteristic attributes.
In addition, to achieve the above object, the present invention further provides a processing apparatus for client segmentation, comprising:
a model building module, configured to establish an actual client model and determine the client categories C in the actual client model;
a discriminator establishing module, configured to establish discriminators for judging client characteristic attributes;
a computing module, configured to calculate the frequency of each client category C in the actual client model, the frequency of each characteristic attribute under each client category C, and the frequency of the other discriminators conditioned on each discriminator;
a discriminator filtering module, configured to calculate, from the data obtained by the computing module, the symmetrical uncertainty value between every two discriminators and between each discriminator and the client category C, to search for the discriminators to be deleted according to the calculation result, to delete them, and to retain the reasonable discriminators;
a client segmentation module, configured to collect client data and perform naive Bayes classification on the client according to the client data using the retained discriminators.
Preferably, the discriminator filtering module comprises:
a searching unit, configured to find the discriminator Ap whose symmetrical uncertainty value with the client category C is the largest;
a first deleting unit, configured to delete the discriminators whose symmetrical uncertainty value with other discriminators is greater than the maximum symmetrical uncertainty value SU(Ap, C), and to retain the other discriminators.
Preferably, the first deleting unit is specifically configured to repeatedly perform the following operation: when the symmetrical uncertainty value SU(Ai, Aq) between a discriminator Aq different from Ap and at least one other discriminator is greater than or equal to the maximum symmetrical uncertainty value SU(Ap, C), delete the discriminator Aq; repeat until the symmetrical uncertainty values between all remaining discriminators are less than SU(Ap, C).
Preferably, the discriminator filtering module further comprises:
a comparing unit, configured to compare, before the searching unit finds the discriminator Ap whose symmetrical uncertainty value with the client category C is the largest, the symmetrical uncertainty value between each discriminator and the client category with a preset symmetrical uncertainty threshold;
a second deleting unit, configured to delete the discriminators whose symmetrical uncertainty value with the client category is greater than or equal to the symmetrical uncertainty threshold.
Preferably, at least three discriminators are established, and each discriminator comprises at least two client characteristic attributes.
The processing method and apparatus for client segmentation provided by the present invention establish an actual client model and determine the client categories C in it; establish discriminators for judging client characteristic attributes; calculate the frequency of each client category C in the actual client model, the frequency of each characteristic attribute under each client category C, and the frequency of the other discriminators conditioned on each discriminator; calculate, from these data, the symmetrical uncertainty value between every two discriminators and between each discriminator and the client category C; search for the discriminators to be deleted according to the calculation result, delete them, and retain the discriminators that are strongly independent of one another; and collect client data and perform naive Bayes classification on the client according to the client data using the retained discriminators. This reduces the correlation and redundancy among discriminators, so that the retained discriminators are strongly independent of one another. Because fewer discriminators are used, the amount and complexity of computation in client segmentation decrease; because the retained discriminators are strongly independent, the accuracy of client segmentation is higher.
Brief description of the drawings
Fig. 1 is a flowchart of an embodiment of the client segmentation method of the present invention;
Fig. 2 is a detailed flowchart of step S50 in Fig. 1;
Fig. 3 is another detailed flowchart of step S50 in Fig. 1;
Fig. 4 is a functional block diagram of an embodiment of the processing apparatus for client segmentation of the present invention;
Fig. 5 is a detailed functional block diagram of the discriminator filtering module in Fig. 4;
Fig. 6 is another detailed functional block diagram of the discriminator filtering module in Fig. 4.
The realization of the object, the functional features and the advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein only illustrate the present invention and are not intended to limit it.
The present invention provides a method of client segmentation. Referring to Fig. 1, Fig. 1 is a flowchart of an embodiment of the client segmentation method of the present invention. In one embodiment, the method of client segmentation comprises:
Step S10: establish an actual client classification model and determine the client categories C in the actual client classification model.
The size of the actual client classification model established in this step can be adjusted according to actual needs; for example, a model containing 1000 clients can be established. The model records the relevant data of each client and provides the data for the calculations in the subsequent steps.
Step S20: establish discriminators for judging client characteristic attributes.
In step S20, at least three discriminators are established, and each discriminator comprises at least two client characteristic attributes.
Step S30: calculate the frequency of each client category C in the actual client model, the frequency of each characteristic attribute under each client category C, and the frequency of the other discriminators conditioned on each discriminator.
Step S40: from the data calculated in step S30, calculate the symmetrical uncertainty value between every two discriminators and between each discriminator and the client category C.
Step S50: search for the discriminators to be deleted according to the result of step S40, delete them, and retain the discriminators that are strongly independent of one another.
Step S60: collect client data, and perform naive Bayes classification on the client according to the client data using the retained discriminators.
Referring to Fig. 2, Fig. 2 is a detailed flowchart of step S50 in Fig. 1. Step S50 specifically comprises:
Step S501: find the discriminator Ap whose symmetrical uncertainty value with the client category C is the largest.
Step S502: delete the discriminators whose symmetrical uncertainty value with other discriminators is greater than the maximum symmetrical uncertainty value SU(Ap, C), and retain the other discriminators.
In this embodiment, the symmetrical uncertainty value measures the degree of correlation between two discriminators or between a discriminator and a category, and its value range is [0, 1]. The larger the value, the stronger the correlation between the two; the smaller the value, the weaker the correlation and the stronger the mutual independence. For example, when the symmetrical uncertainty value between discriminator 1 and discriminator 2 is 0, the two are independent of each other. Finding the discriminator Ap whose symmetrical uncertainty value with the client category C is the largest, and deleting the discriminators whose symmetrical uncertainty value with other discriminators is greater than SU(Ap, C), filters out the discriminators that are highly correlated with other discriminators and therefore redundant, so that the retained discriminators are strongly independent of one another. This improves the accuracy of subsequent client segmentation while reducing its computational complexity and amount of computation.
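As an illustration of the measure described above, the following minimal sketch computes a symmetrical uncertainty value from two columns of discrete observations. It assumes the commonly used definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)); the text here does not confirm that this is the exact formula used, and the function names and sample data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both columns are constant; treated here as uninformative
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy     # mutual information I(X; Y)
    return 2.0 * mutual_info / (hx + hy)


# Hypothetical observations of two binary attributes
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]  # statistically independent of a in this sample
su_same = symmetrical_uncertainty(a, a)   # identical columns
su_indep = symmetrical_uncertainty(a, b)  # independent columns
print(su_same, su_indep)
```

Under this definition, identical columns give the maximum value and independent columns give 0, which matches the statement that a value of 0 means the two discriminators are independent.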
In this embodiment, step S502 is specifically: repeatedly perform the following operation: when the symmetrical uncertainty value SU(Ai, Aq) between a discriminator Aq different from Ap and at least one other discriminator is greater than or equal to the maximum symmetrical uncertainty value SU(Ap, C), delete the discriminator Aq; repeat until the symmetrical uncertainty values between all remaining discriminators are less than SU(Ap, C). For example, suppose discriminators A1, A2, A3, A4, A5, A6, A7 and A8 have been established, the symmetrical uncertainty value SU(A6, C1) between A6 and client category C1 is the largest, and the symmetrical uncertainty value SU(A1, A2) between A1 and A2 is greater than SU(A6, C1). Either A1 or A2 can then be deleted. If A1 is deleted, it is judged whether, among the remaining discriminators A2, A3, A4, A5, A6, A7 and A8 other than A6, there is a pair whose symmetrical uncertainty value is greater than or equal to SU(A6, C1). If the symmetrical uncertainty value SU(A2, A5) between A2 and A5 is greater than SU(A6, C1), either A2 or A5 can be deleted. If A5 is deleted, it is judged whether, among the remaining discriminators A2, A3, A4, A6, A7 and A8 other than A6, there is a pair whose symmetrical uncertainty value is greater than or equal to SU(A6, C1), and so on, until the symmetrical uncertainty values between all remaining discriminators are less than the maximum symmetrical uncertainty value SU(A6, C1).
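The deletion loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the text allows either member of an offending pair to be deleted, so the tie-breaking rule used here (drop the member less related to the category, and never Ap) is an assumption, as are the function name, the dictionary layout and the SU values.

```python
def filter_redundant(su_class, su_pair):
    """Delete discriminators until every remaining pair has SU below SU(Ap, C).

    su_class: dict mapping discriminator name -> SU(discriminator, category)
    su_pair:  dict mapping frozenset({a, b}) -> SU(a, b); missing pairs count as 0
    Returns the list of retained discriminator names.
    """
    kept = list(su_class)
    a_p = max(kept, key=su_class.get)  # discriminator most related to the category
    limit = su_class[a_p]              # the maximum value SU(Ap, C)
    while True:
        offending = [
            (a, b)
            for i, a in enumerate(kept)
            for b in kept[i + 1:]
            if su_pair.get(frozenset((a, b)), 0.0) >= limit
        ]
        if not offending:
            return kept
        a, b = offending[0]
        # Assumption: delete the pair member less related to the category, never Ap.
        candidates = [d for d in (a, b) if d != a_p]
        kept.remove(min(candidates, key=su_class.get))


# Hypothetical SU values for four discriminators
su_class = {"A1": 0.30, "A2": 0.25, "A3": 0.20, "A4": 0.10}
su_pair = {frozenset(("A3", "A4")): 0.35}
retained = filter_redundant(su_class, su_pair)
print(retained)  # ['A1', 'A2', 'A3']
```

The loop always terminates because each pass removes one discriminator and Ap is never a deletion candidate.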
Referring to Fig. 3, Fig. 3 is another detailed flowchart of step S50 in Fig. 1. In the above embodiment, when the number of discriminators is too large, the following processing may also be performed before step S501:
Step S5011: compare the symmetrical uncertainty value between each discriminator and the client category with a preset symmetrical uncertainty threshold.
Step S5012: delete the discriminators whose symmetrical uncertainty value with the client category is greater than or equal to the symmetrical uncertainty threshold.
In this embodiment, the symmetrical uncertainty threshold is an optimum value predicted by training a neural network algorithm on empirical values according to the actual client model. Deleting the discriminators whose symmetrical uncertainty value with the client category is greater than or equal to the symmetrical uncertainty threshold ensures the correlation between the remaining discriminators and the categories, and thereby ensures the accuracy of client segmentation.
In this embodiment of the processing method for client segmentation, the correlation and redundancy among discriminators are reduced, so that the retained discriminators are strongly independent of one another. Because fewer discriminators are used in subsequent client segmentation, the amount and complexity of computation decrease; because the retained discriminators are strongly independent, the accuracy of client segmentation is higher.
The present invention further provides a processing apparatus for client segmentation. Referring to Fig. 4, Fig. 4 is a functional block diagram of an embodiment of the processing apparatus for client segmentation of the present invention. In one embodiment, the processing apparatus 100 comprises a model building module 110, a discriminator establishing module 120, a computing module 130, a discriminator filtering module 140 and a client segmentation module 150. The model building module 110 is configured to establish an actual client model and determine the client categories C in the actual client model. The discriminator establishing module 120 is configured to establish discriminators for judging client characteristic attributes. The computing module 130 is configured to calculate the frequency of each client category C in the actual client model, the frequency of each characteristic attribute under each client category C, and the frequency of the other discriminators conditioned on each discriminator. The discriminator filtering module 140 is configured to calculate, from the data obtained by the computing module, the symmetrical uncertainty value between every two discriminators and between each discriminator and the client category C, to search for the discriminators to be deleted according to the calculation result, to delete them, and to retain the discriminators that are strongly independent of one another. The client segmentation module 150 is configured to collect client data and perform naive Bayes classification on the client according to the client data using the retained discriminators.
The size of the actual client classification model established in this embodiment can be adjusted according to actual needs; for example, a model containing 1000 clients can be established. The model records the relevant data of each client and provides the data for the subsequent calculations of the computing module 130.
In this embodiment, at least three discriminators are established, and each discriminator comprises at least two client characteristic attributes.
Referring to Fig. 5, Fig. 5 is a detailed functional block diagram of the discriminator filtering module in Fig. 4. The discriminator filtering module 140 comprises a searching unit 141 and a first deleting unit 142. The searching unit 141 is configured to find the discriminator Ap whose symmetrical uncertainty value with the client category C is the largest. The first deleting unit 142 is configured to delete the discriminators whose symmetrical uncertainty value with other discriminators is greater than the maximum symmetrical uncertainty value SU(Ap, C), and to retain the other discriminators. In this embodiment, the symmetrical uncertainty value measures the degree of correlation between two discriminators or between a discriminator and a category, and its value range is [0, 1]. The larger the value, the stronger the correlation between the two; the smaller the value, the weaker the correlation and the stronger the mutual independence. For example, when the symmetrical uncertainty value between discriminator 1 and discriminator 2 is 0, the two are independent of each other. Finding the discriminator Ap whose symmetrical uncertainty value with the client category C is the largest, and deleting the discriminators whose symmetrical uncertainty value with other discriminators is greater than SU(Ap, C), filters out the discriminators that are highly correlated with other discriminators and therefore redundant, so that the retained discriminators are strongly independent of one another. This improves the accuracy of subsequent client segmentation while reducing its computational complexity and amount of computation.
Further, the first deleting unit 142 is specifically configured to repeatedly perform the following operation: when the symmetrical uncertainty value SU(Ai, Aq) between a discriminator Aq different from Ap and at least one other discriminator is greater than or equal to the maximum symmetrical uncertainty value SU(Ap, C), delete the discriminator Aq; repeat until the symmetrical uncertainty values between all remaining discriminators are less than SU(Ap, C). For example, suppose discriminators A1, A2, A3, A4, A5, A6, A7 and A8 have been established, the symmetrical uncertainty value SU(A6, C1) between A6 and client category C1 is the largest, and the symmetrical uncertainty value SU(A1, A2) between A1 and A2 is greater than SU(A6, C1). Either A1 or A2 can then be deleted. If A1 is deleted, it is judged whether, among the remaining discriminators A2, A3, A4, A5, A6, A7 and A8 other than A6, there is a pair whose symmetrical uncertainty value is greater than or equal to SU(A6, C1). If the symmetrical uncertainty value SU(A2, A5) between A2 and A5 is greater than SU(A6, C1), either A2 or A5 can be deleted. If A5 is deleted, it is judged whether, among the remaining discriminators A2, A3, A4, A6, A7 and A8 other than A6, there is a pair whose symmetrical uncertainty value is greater than or equal to SU(A6, C1), and so on, until the symmetrical uncertainty values between all remaining discriminators are less than the maximum symmetrical uncertainty value SU(A6, C1).
Referring to Fig. 6, Fig. 6 is another detailed functional block diagram of the discriminator filtering module in Fig. 4. The discriminator filtering module 140 further comprises a comparing unit 143 and a second deleting unit 144. The comparing unit 143 is configured to compare, before the searching unit 141 finds the discriminator Ap whose symmetrical uncertainty value with the client category C is the largest, the symmetrical uncertainty value between each discriminator and the client category with a preset symmetrical uncertainty threshold. The second deleting unit 144 is configured to delete the discriminators whose symmetrical uncertainty value with the client category is greater than or equal to the symmetrical uncertainty threshold. In this embodiment, the symmetrical uncertainty threshold is an optimum value predicted by training a neural network algorithm on empirical values according to the actual client model. Deleting the discriminators whose symmetrical uncertainty value with the client category is greater than or equal to the symmetrical uncertainty threshold ensures the correlation between the remaining discriminators and the categories, and thereby ensures the accuracy of client segmentation.
In this embodiment of the processing apparatus 100 for client segmentation, the correlation and redundancy among discriminators are reduced, so that the retained discriminators are strongly independent of one another. Because fewer discriminators are used in subsequent client segmentation, the amount and complexity of computation decrease; because the retained discriminators are strongly independent, the accuracy of client segmentation is higher.
The client segmentation method of the present invention is described in detail below with a specific merchandise sales example.
Step 1: collect client data from the sales floor and divide the collected data into two classes: the first class is used to determine reasonable discriminators; the second class is used to verify the client segmentation effect obtained with those discriminators, that is, to validate the algorithm.
Step 2: establish the actual client model from the first class of data and divide clients into three categories: C1 = actual customer, C2 = potential customer, C3 = non-customer.
The discriminators classify clients mainly according to the attributes below.
Step 3: establish the following four discriminators:
Discriminator A1: number of times the client has visited the store
Discriminator A2: whether the client visits the store with family members
Discriminator A3: whether the client communicates professionally with the sales staff
Discriminator A4: whether the client is decently dressed
The characteristic attributes of discriminators A1, A2, A3 and A4 are divided as follows:
A1: {a1 ≤ 1, 1 < a1 < 3, a1 ≥ 3}, where a1 is the number of store visits.
A2: {a2 = 0, a2 = 1}, where a2 = 0 means the client did not bring family members and a2 = 1 means the client did.
A3: {a3 = 0, a3 = 1}, where a3 = 0 means the client's communication is not professional and a3 = 1 means it is professional.
A4: {a4 = 0, a4 = 1}, where a4 = 0 means the client is not decently dressed and a4 = 1 means the client is decently dressed.
Step 4: calculate the following data used by the naive Bayes algorithm:
1. The frequencies of C1, C2 and C3 in a sample of the set size:
Suppose the established actual client model contains 1000 client samples, of which 332 are C1 actual customers, 103 are C2 potential customers and 565 are C3 non-customers. Then the frequency of C1 actual customers is P(C1) = 332/1000 = 0.332; the frequency of C2 potential customers is P(C2) = 103/1000 = 0.103; and the frequency of C3 non-customers is P(C3) = 565/1000 = 0.565.
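The category frequencies can be reproduced from the sample counts stated above (332 actual customers, 103 potential customers and 565 non-customers out of 1000). The following minimal sketch only restates that arithmetic; the variable names are illustrative.

```python
from collections import Counter

# Category counts from the 1000-client sample described in the text
counts = Counter({"C1": 332, "C2": 103, "C3": 565})
total = sum(counts.values())

# Frequency of each category, used as the prior P(C) in naive Bayes
priors = {category: n / total for category, n in counts.items()}
print(priors)
```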
2. The frequency of each characteristic attribute under C1, C2 and C3, obtained from the conditional probability formula P(A | B) = P(AB) / P(B).
For example, the frequencies of the characteristic attributes a1 ≤ 1, 1 < a1 < 3 and a1 ≥ 3 under C1 are:
P(a1 ≤ 1 | C1) = 0.7, P(1 < a1 < 3 | C1) = 0.2, P(a1 ≥ 3 | C1) = 0.1.
The frequencies of the other characteristic attributes under C1, and of each characteristic attribute under C2 and C3, are calculated in the same way.
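A minimal sketch of how such conditional frequencies could be counted from sample records follows. The record layout, the helper name `conditional_frequencies` and the five sample rows are hypothetical and chosen only to make the counting visible; they are not data from the text.

```python
from collections import Counter, defaultdict


def conditional_frequencies(samples, attribute, given="category"):
    """Return {given value: {attribute value: P(attribute value | given value)}}."""
    grouped = defaultdict(Counter)
    for row in samples:
        grouped[row[given]][row[attribute]] += 1
    return {
        g: {value: n / sum(counter.values()) for value, n in counter.items()}
        for g, counter in grouped.items()
    }


# Hypothetical records; a1 is the binned number of store visits
samples = [
    {"category": "C1", "a1": "<=1", "a2": 0},
    {"category": "C1", "a1": "<=1", "a2": 1},
    {"category": "C1", "a1": ">=3", "a2": 0},
    {"category": "C1", "a1": "<=1", "a2": 0},
    {"category": "C3", "a1": "1<a1<3", "a2": 1},
]
freq = conditional_frequencies(samples, "a1")
print(freq["C1"]["<=1"])  # 0.75
```

Passing another attribute as `given` (for example `given="a2"`) yields attribute-on-attribute frequencies of the form P(a1 ≤ 1 | a2 = 0) with the same counting.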
Step 5: calculate the frequency of each discriminator conditioned on each of the other discriminators.
For example, P(a1 ≤ 1 | a2 = 0) = 0.3, P(a1 ≤ 1 | a3 = 0) = 0.4, P(a1 ≤ 1 | a4 = 0) = 0.3.
Step 6: from the data calculated in the preceding steps and the symmetrical uncertainty formula (formula not reproduced in this text), calculate the symmetrical uncertainty value between every two discriminators and between each discriminator and each category. Here SU(A1, A2) denotes the symmetrical uncertainty value between discriminator A1 and discriminator A2, and SU(A1, C1) denotes the symmetrical uncertainty value between discriminator A1 and the actual customer category C1.
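For reference, symmetrical uncertainty is commonly defined as shown below. This is the standard textbook form and is only assumed, not confirmed, to match the formula used in the original publication; X and Y stand for any two discriminators, or a discriminator and the category.

```latex
SU(X, Y) = 2 \cdot \frac{IG(X \mid Y)}{H(X) + H(Y)}, \qquad
IG(X \mid Y) = H(X) - H(X \mid Y),

H(X) = -\sum_{x} P(x) \log_2 P(x), \qquad
H(X \mid Y) = -\sum_{y} P(y) \sum_{x} P(x \mid y) \log_2 P(x \mid y).
```

With this form, the category frequencies, the per-category attribute frequencies and the attribute-on-attribute frequencies from the earlier steps supply every P(x), P(y) and P(x | y) that the entropies require.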
The maximum symmetrical uncertainty value SU(A, C) between a discriminator A and a category C is determined from the above calculation. The calculation shows that only the symmetrical uncertainty value between discriminator A4 and discriminator A3 is greater than SU(A, C), so either A4 or A3 can be deleted. In this embodiment, discriminator A4 is deleted and the remaining discriminators A1, A2 and A3 are retained.
Step 7: collect client data and, using discriminators A1, A2 and A3, perform naive Bayes classification on each client according to that client's data.
Suppose a client's data are a1 ≤ 1, a2 = 0, a3 = 0. Using the data calculated in Steps 1 to 4, the probability of the client belonging to each of the three categories is calculated as follows:
C1, actual customer: P(C1) P(x | C1) = P(C1) P(a1 ≤ 1 | C1) P(a2 = 0 | C1) P(a3 = 0 | C1) = 0.153
C2, potential customer: P(C2) P(x | C2) = P(C2) P(a1 ≤ 1 | C2) P(a2 = 0 | C2) P(a3 = 0 | C2) = 0.064
C3, non-customer: P(C3) P(x | C3) = P(C3) P(a1 ≤ 1 | C3) P(a2 = 0 | C3) P(a3 = 0 | C3) = 0.216
From the above results, the probability that the client is a non-customer is the largest, so the client is assigned to the non-customer category C3.
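The scoring rule used above, P(C) multiplied by the conditional frequency of each observed attribute, can be sketched as follows. The priors follow from the sample counts (332, 103 and 565 of 1000) and P(a1 ≤ 1 | C1) = 0.7 is taken from the text; every other conditional frequency in the table is a hypothetical placeholder, so the printed scores are illustrative and are not the figures reported above.

```python
from math import prod


def naive_bayes_scores(priors, likelihoods, observation):
    """Return {category: P(C) * product of P(attribute value | C)}."""
    return {
        c: priors[c] * prod(likelihoods[c][attr][val] for attr, val in observation.items())
        for c in priors
    }


priors = {"C1": 0.332, "C2": 0.103, "C3": 0.565}
# Hypothetical conditional frequencies, except P(a1 <= 1 | C1) = 0.7 from the text
likelihoods = {
    "C1": {"a1": {"<=1": 0.7}, "a2": {0: 0.8}, "a3": {0: 0.8}},
    "C2": {"a1": {"<=1": 0.9}, "a2": {0: 0.8}, "a3": {0: 0.9}},
    "C3": {"a1": {"<=1": 0.8}, "a2": {0: 0.6}, "a3": {0: 0.8}},
}
client = {"a1": "<=1", "a2": 0, "a3": 0}

scores = naive_bayes_scores(priors, likelihoods, client)
best = max(scores, key=scores.get)
print(best)  # C3
```

Only the ranking of the scores matters for the decision, so the common denominator P(x) is never computed.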
Comparative example:
Steps 1 to 4 are the same as in the specific example above and are not repeated here.
Suppose the collected data of a client are a1 ≤ 1, a2 = 0, a3 = 0, a4 = 0. Using the data calculated in Steps 1 to 4, the probability of the client belonging to each of the three categories is calculated as follows:
C1, actual customer: P(C1) P(x | C1) = P(C1) P(a1 ≤ 1 | C1) P(a2 = 0 | C1) P(a3 = 0 | C1) P(a4 = 0 | C1) = 0.0212
C2, potential customer: P(C2) P(x | C2) = P(C2) P(a1 ≤ 1 | C2) P(a2 = 0 | C2) P(a3 = 0 | C2) P(a4 = 0 | C2) = 0.0092
C3, non-customer: P(C3) P(x | C3) = P(C3) P(a1 ≤ 1 | C3) P(a2 = 0 | C3) P(a3 = 0 | C3) P(a4 = 0 | C3) = 0.0652
From the above results, the probability that the client is a non-customer is the largest, so the client is assigned to the non-customer category C3.
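The products listed in the specific example and in the comparative example are unnormalized scores. Dividing each by the sum over the three categories turns them into posterior probabilities that can be compared across the two variants. This normalization step is not part of the text and is shown only as an illustrative sketch; the score values are the ones reported above.

```python
def normalize(scores):
    """Scale naive Bayes scores so they sum to 1 (posterior probabilities)."""
    total = sum(scores.values())
    return {category: s / total for category, s in scores.items()}


# Scores reported with discriminators A1, A2, A3 (specific example)
three = {"C1": 0.153, "C2": 0.064, "C3": 0.216}
# Scores reported with discriminators A1, A2, A3, A4 (comparative example)
four = {"C1": 0.0212, "C2": 0.0092, "C3": 0.0652}

post_three = normalize(three)
post_four = normalize(four)
print(max(post_three, key=post_three.get), max(post_four, key=post_four.get))  # C3 C3
```

For this single client both variants select C3; the difference between the methods appears only in the aggregate accuracy comparison.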
Each client in a group of a certain size was classified using both the classification method of the invention (steps as described in the specific embodiment above) and the traditional Naive Bayes classification method (steps as described in the comparative example above). The same clients were then tracked over a long period to determine each client's actual type, and the classifications produced by the two methods were compared with the actual types. The comparison results are shown in Table 1.
Table 1:
| | Specific example | Comparative example |
| --- | --- | --- |
| Classification algorithm | Classification algorithm of the invention | Traditional Naive Bayes classification algorithm |
| Classification accuracy | 93.42% | 68.25% |
Table 1 shows that classifying specific customer types with the client segmentation processing method provided by the invention is more accurate than with the traditional Naive Bayes classification method. In addition, because the discriminators are filtered and their number is reduced, both the computational complexity and the amount of calculation in classifying customer types are reduced.
The above is only a preferred embodiment of the present invention and does not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and accompanying drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A processing method for client segmentation, characterized by comprising:
Step S1: establishing a practical client model, and determining the client classifications C in the practical client model;
Step S2: establishing discriminators for judging client characteristic attributes;
Step S3: calculating the frequency of each client classification C in the practical client model, the frequency of each characteristic attribute under each client classification C, and the frequency of the other discriminators conditioned on each discriminator;
Step S4: calculating, from the data obtained in step S3, the symmetrical uncertainty value between every two discriminators and between each discriminator and the client classification C;
Step S5: searching for the discriminators to be deleted according to the result of step S4, deleting them, and retaining the discriminators that are strongly independent of one another;
Step S6: acquiring customer data, and performing Naive Bayes classification on the client according to the customer data using the discriminators retained in step S5.
2. The processing method for client segmentation according to claim 1, characterized in that searching for the discriminators to be deleted according to the calculation result and deleting them specifically comprises:
Step S51: finding the discriminator Ap whose symmetrical uncertainty value with the client classification C is the maximum;
Step S52: deleting the discriminators whose symmetrical uncertainty value with another discriminator is greater than the maximum symmetrical uncertainty value SU_{p,C}, and retaining the other discriminators.
3. The processing method for client segmentation according to claim 2, characterized in that step S52 is specifically: cyclically performing the operation of deleting a discriminator Aq, different from Ap, when the symmetrical uncertainty value SU_{i,q} between Aq and at least one other discriminator is greater than or equal to the maximum symmetrical uncertainty value SU_{p,C}, until the symmetrical uncertainty values between the discriminators are all less than the maximum symmetrical uncertainty value SU_{p,C}.
4. The processing method for client segmentation according to claim 2 or 3, characterized in that, before step S51, the method further comprises:
Step S511: comparing the symmetrical uncertainty value between each discriminator and the client classification with a preset symmetrical uncertainty threshold;
Step S512: deleting the discriminators whose symmetrical uncertainty value with the client classification is greater than or equal to the symmetrical uncertainty threshold.
5. The processing method for client segmentation according to claim 1, characterized in that at least three discriminators are established in step S2, and each discriminator includes at least two client characteristic attributes.
6. A processing device for client segmentation, characterized by comprising:
a model establishing module, configured to establish a practical client model and determine the client classifications C in the practical client model;
a discriminator establishing module, configured to establish discriminators for judging client characteristic attributes;
a computing module, configured to calculate the frequency of each client classification C in the practical client model, the frequency of each characteristic attribute under each client classification C, and the frequency of the other discriminators conditioned on each discriminator;
a discriminator filtering module, configured to calculate, from the data obtained by the computing module, the symmetrical uncertainty value between every two discriminators and between each discriminator and the client classification C, to search for the discriminators to be deleted according to the calculation result, to delete them, and to retain the discriminators that are strongly independent of one another;
a client segmentation module, configured to acquire customer data and perform Naive Bayes classification on the client according to the customer data using the retained discriminators.
7. The processing device for client segmentation according to claim 6, characterized in that the discriminator filtering module comprises:
a searching unit, configured to find the discriminator Ap whose symmetrical uncertainty value with the client classification C is the maximum;
a first deleting unit, configured to delete the discriminators whose symmetrical uncertainty value with another discriminator is greater than the maximum symmetrical uncertainty value SU_{p,C}, and to retain the other discriminators.
8. The processing device for client segmentation according to claim 7, characterized in that the first deleting unit is specifically configured to cyclically perform the operation of deleting a discriminator Aq, different from Ap, when the symmetrical uncertainty value SU_{i,q} between Aq and at least one other discriminator is greater than or equal to the maximum symmetrical uncertainty value SU_{p,C}, until the symmetrical uncertainty values between the discriminators are all less than the maximum symmetrical uncertainty value SU_{p,C}.
9. The processing device for client segmentation according to claim 7 or 8, characterized in that the discriminator filtering module further comprises:
a comparing unit, configured to compare the symmetrical uncertainty value between each discriminator and the client classification with a preset symmetrical uncertainty threshold before the searching unit finds the discriminator Ap whose symmetrical uncertainty value with the client classification C is the maximum;
a second deleting unit, configured to delete the discriminators whose symmetrical uncertainty value with the client classification is greater than or equal to the symmetrical uncertainty threshold.
10. The processing device for client segmentation according to claim 6, characterized in that at least three discriminators are established, and each discriminator includes at least two client characteristic attributes.
CN201410468888.1A 2014-09-15 2014-09-15 The treating method and apparatus of client segmentation Active CN105447050B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410468888.1A CN105447050B (en) 2014-09-15 2014-09-15 The treating method and apparatus of client segmentation

Publications (2)

Publication Number Publication Date
CN105447050A CN105447050A (en) 2016-03-30
CN105447050B true CN105447050B (en) 2019-04-02

Family

ID=55557232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410468888.1A Active CN105447050B (en) 2014-09-15 2014-09-15 The treating method and apparatus of client segmentation

Country Status (1)

Country Link
CN (1) CN105447050B (en)



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant