CN107194815B - Client segmentation method and system - Google Patents
Client segmentation method and system
- Publication number
- CN107194815B CN201611005111.7A
- Authority
- CN
- China
- Prior art keywords
- client
- clients
- information field
- distance
- local density
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Abstract
The invention discloses a client segmentation method and system. The method includes: obtaining the information of all clients; screening preset information fields from the information of each client; establishing a density-based clustering algorithm model and calculating the local density corresponding to each client according to the screened information fields; and dividing all clients into different categories according to the calculated local densities. Clients can thereby be classified accurately and comprehensively, which provides an effective reference for product promotion.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a client segmentation method and system.
Background
In the insurance industry, insured clients usually need to be classified and counted so that sales staff can formulate different marketing strategies for different client categories. However, existing approaches still classify clients directly by data such as age, coverage amount and premium. Such approaches use few evaluation criteria, give results of low accuracy, and cannot mine the deeper information within the data, so they cannot provide sales staff with an effective reference for product promotion.
Summary of the invention
In view of this, the purpose of the present invention is to provide a client segmentation method and system, so as to solve the problem of how to classify clients accurately and comprehensively.
To achieve the above object, the present invention provides a client segmentation method comprising the steps of:
Obtaining the information of all clients;
Screening preset information fields from the information of each client;
Establishing a density-based clustering algorithm model, and calculating the local density corresponding to each client according to the screened information fields; and
Dividing all clients into different categories according to the calculated local densities.
Preferably, the preset information fields include the client's region, the nature of the client's employer, the insurance types and liabilities the client has previously purchased, the coverage amount, the premium and the claims information, and the content of each information field corresponds to a numerical value.
Preferably, the step of establishing a density-based clustering algorithm model and calculating the local density corresponding to each client according to the screened information fields specifically includes:
Evaluating the distance between every two clients according to the Euclidean distance formula;
Setting a threshold $d_c$ for distinguishing client similarity;
Calculating the local density corresponding to each client according to the threshold $d_c$ and the local density formula.
Preferably, the Euclidean distance formula is
$$d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
where $d_{ij}$ is the distance between client i and client j, $x_{i1}, \dots, x_{im}$ are the numerical values of the m information fields of client i, and $x_{j1}, \dots, x_{jm}$ are the numerical values of the m information fields of client j.
Preferably, the threshold $d_c$ satisfies the following condition: among the values $d_{ij}$ calculated for the distance between every two clients, $d_c$ is greater than or equal to 80% of all the $d_{ij}$ values.
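Preferably, the local density formula is
$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c)$$
where
$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases}$$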
Preferably, the step of dividing all clients into different categories according to the calculation result specifically includes:
Sorting the calculated local densities in descending order;
Dividing all clients into K categories, using the K clients with the largest local density as reference points;
Determining the optimal value of the number of categories K;
Completing the category division of all clients according to the determined optimal number of categories.
Preferably, the step of dividing all clients into K categories, using the K clients with the largest local density as reference points, specifically includes:
Selecting, according to the sorting, the K clients with the largest local density as reference points;
Grouping each of the K reference points into one category together with the similar clients whose distance to it is less than the threshold;
For the clients remaining after this grouping, calculating the distance between each remaining client and each of the K reference points, and assigning the remaining client to the category of the nearest reference point.
Preferably, the step of determining the optimal value of the number of categories K specifically includes:
Regarding all clients as a domain, in which each client is a sample;
For the K categories, calculating the sum of the distances from the center of each category to the center of the entire domain, denoted the first distance sum;
For each category, calculating the sum of the distances from each sample in the category to the center of the category, denoted the second distance sum;
Calculating the total of the second distance sums of all K categories, denoted the third distance sum;
Calculating the ratio of the first distance sum to the third distance sum;
Taking the number of categories K at which the ratio is largest as the optimal value.
The client segmentation method proposed by the present invention can divide all clients into different categories accurately and comprehensively according to client attributes, and optimizes the number of categories so that the classification is more reasonable. It can provide sales staff with an effective reference for product promotion and facilitates precision marketing.
To achieve the above object, the present invention also proposes a client segmentation system, which includes:
An acquisition module, for obtaining the information of all clients;
A screening module, for screening preset information fields from the information of each client;
A computing module, for establishing a density-based clustering algorithm model and calculating the local density corresponding to each client according to the screened information fields; and
A classification module, for dividing all clients into different categories according to the calculated local densities.
Preferably, the process by which the computing module calculates the local density corresponding to each client specifically includes:
Evaluating the distance between every two clients according to the Euclidean distance formula;
Setting a threshold $d_c$ for distinguishing client similarity;
Calculating the local density corresponding to each client according to the threshold $d_c$ and the local density formula.
Preferably, the process by which the classification module divides all clients into different categories according to the calculation result specifically includes:
Sorting the calculated local densities in descending order;
Dividing all clients into K categories, using the K clients with the largest local density as reference points;
Determining the optimal value of the number of categories K;
Completing the category division of all clients according to the determined optimal number of categories.
The client segmentation system proposed by the present invention can divide all clients into different categories accurately and comprehensively according to client attributes, and optimizes the number of categories so that the classification is more reasonable. It can provide sales staff with an effective reference for product promotion and facilitates precision marketing.
Description of the drawings
Fig. 1 is a flowchart of the client segmentation method proposed by the first embodiment of the present invention;
Fig. 2 is a detailed flowchart of step S104 in Fig. 1;
Fig. 3 is a detailed flowchart of step S106 in Fig. 1;
Fig. 4 is a detailed flowchart of step S302 in Fig. 3;
Fig. 5 is a module diagram of the client segmentation system proposed by the second embodiment of the present invention.
The realization of the object, the functional features and the advantages of the present invention will be further described with reference to the embodiments and the accompanying drawings.
Detailed description of the embodiments
In order to make the technical problems to be solved, the technical solutions and the advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.
First embodiment
As shown in Fig. 1, the first embodiment of the present invention proposes a client segmentation method, which includes the following steps:
S100: obtain the information of all clients.
Specifically, the relevant information of all clients that need to be classified and counted is obtained, where the number of clients is n (n is a positive integer).
S102: screen preset information fields from the information of each client.
Specifically, m information fields with reference value (m is a positive integer) can be preset as the basis for classifying clients. That is, each client has m effective information fields, for example the client's region, the nature of the client's employer, the insurance types and liabilities the client has previously purchased, the coverage amount, the premium and the claims information.
In the present embodiment, the content of the m information fields can be converted into corresponding numerical values, so that the distance between clients can subsequently be calculated and the similarity between clients determined. For example, if the client's region is Beijing, the corresponding information field is recorded as the value 1; if the client's region is Shanghai, the corresponding information field is recorded as the value 2; and so on. The value for each region can be set according to preset criteria such as the geographical distance of the client's location or the size of the city. As another example, if the client's coverage amount is below 100,000, the corresponding information field is recorded as the value 1; if it is between 100,000 and 500,000, the field is recorded as the value 2; if it is between 500,000 and 1,000,000, the field is recorded as the value 3; and so on.
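The numeric conversion described above can be sketched as a pair of lookup functions. This is only an illustration: the names `REGION_CODES`, `COVERAGE_BANDS`, `encode_region`, `encode_coverage` and `encode_client` are hypothetical, the record layout is assumed, and the treatment of band boundaries (each band includes its lower bound and excludes its upper bound) is an assumption, since the text does not say which band a boundary amount belongs to.

```python
# Hypothetical lookup table; the text gives only Beijing -> 1 and Shanghai -> 2.
REGION_CODES = {"Beijing": 1, "Shanghai": 2}

# (exclusive upper bound, code) pairs for the coverage bands named in the text.
COVERAGE_BANDS = [(100_000, 1), (500_000, 2), (1_000_000, 3)]


def encode_region(region):
    """Map a region name to its preset numeric code."""
    return REGION_CODES[region]


def encode_coverage(amount):
    """Map a coverage amount to the code of the band it falls in."""
    for upper, code in COVERAGE_BANDS:
        if amount < upper:
            return code
    # Amounts of 1,000,000 or more are not covered by the examples in the text;
    # continuing the sequence with the next integer is an assumption.
    return len(COVERAGE_BANDS) + 1


def encode_client(record):
    """Turn a raw client record (assumed dict layout) into a numeric vector."""
    return [encode_region(record["region"]), encode_coverage(record["coverage"])]
```

For instance, `encode_client({"region": "Shanghai", "coverage": 300_000})` yields a two-element vector that can be fed to the distance calculation of the following step.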
S104: establish a density-based clustering algorithm model, and calculate the local density corresponding to each client according to the screened information fields.
Specifically, refer to Fig. 2, which is a detailed flowchart of step S104. The flow includes the steps:
S200: evaluate the distance between every two clients according to the Euclidean distance formula.
In the present embodiment, the Euclidean distance formula is
$$d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
where $d_{ij}$ is the distance between client i (i = 1, 2, ..., n) and client j (j = 1, 2, ..., n), $x_{i1}, \dots, x_{im}$ are the numerical values of the m information fields of client i, and $x_{j1}, \dots, x_{jm}$ are the numerical values of the m information fields of client j. The distance reflects the similarity between two clients: the smaller the calculated value of $d_{ij}$, the more similar client i and client j are.
In the present embodiment, the distance $d_{ij}$ must be calculated for every pair among the n clients, so that the similarity between every two clients can be determined.
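A minimal sketch of step S200, assuming each client has already been converted to a list of m numeric field values. The function names `euclidean` and `pairwise_distances` are illustrative and do not come from the patent.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_distances(clients):
    """Return the symmetric n x n matrix of distances d[i][j]."""
    n = len(clients)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The distance is symmetric, so each pair is computed only once.
            d[i][j] = d[j][i] = euclidean(clients[i], clients[j])
    return d
```

Computing every pair costs on the order of n² distance evaluations, which is the price of comparing each client with every other client as the text requires.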
S202: set the threshold for distinguishing client similarity.
In the present embodiment, the threshold is denoted $d_c$ and is used to decide whether two clients are relatively similar or relatively dissimilar. It must satisfy the following condition: among the values $d_{ij}$ calculated for the distance between every two clients, $d_c$ is greater than or equal to 80% of all the $d_{ij}$ values. For example, if 100 values of $d_{ij}$ are calculated for all clients, the threshold $d_c$ must be greater than or equal to 80 of them. When the distance $d_{ij}$ between two clients is less than the threshold $d_c$, the two clients are considered relatively similar; when the distance $d_{ij}$ between two clients is greater than or equal to the threshold $d_c$, the two clients are considered relatively dissimilar.
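One way to pick a threshold that meets this condition is to sort the pairwise distances and take the value at the 80% rank. The text only requires that the threshold be at least as large as 80% of the distances; choosing the smallest such value, as this sketch does, is an assumption, and the name `choose_threshold` is hypothetical.

```python
import math


def choose_threshold(distances, fraction=0.8):
    """Return the smallest distance that is >= `fraction` of all distances.

    `distances` is a flat list holding one value per pair of clients.
    """
    if not distances:
        raise ValueError("at least one pairwise distance is required")
    ordered = sorted(distances)
    # Number of distances the threshold must be greater than or equal to;
    # rounding guards against floating-point noise before taking the ceiling.
    rank = max(1, math.ceil(round(fraction * len(ordered), 9)))
    return ordered[rank - 1]
```

With 100 distinct distances this returns the 80th smallest value, which matches the 100-value example in the text.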
S204: calculate the local density corresponding to each client according to the threshold and the local density formula.
In the present embodiment, the local density formula is
$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c)$$
where
$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases}$$
The local density reflects the number of other clients that are relatively similar to a given client: the larger the calculated local density, the more other clients are relatively similar to that client.
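Read this way, the local density of a client is a count of the other clients lying closer than the threshold. The sketch below assumes a precomputed square distance matrix and uses the strict comparison that the text gives for "relatively similar"; the name `local_densities` is illustrative.

```python
def local_densities(dist, d_c):
    """Count, for each client i, the other clients j with dist[i][j] < d_c."""
    n = len(dist)
    return [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]
```

Excluding the client itself (`j != i`) follows the description of the density as a count of other clients.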
Returning to Fig. 1, S106: divide all clients into different categories according to the calculation result.
Specifically, refer to Fig. 3, which is a detailed flowchart of step S106. The flow includes the steps:
S300: sort the calculated local densities in descending order.
Specifically, a local density is calculated for each client, so the n clients correspond to n local densities, which are then sorted in descending order.
S302: divide all clients into K categories (0<K<n), using the K clients with the largest local density as reference points. A reference point is a client used as the standard for defining a category; that is, other clients that are relatively similar to the reference client can be grouped into one category with that client.
Specifically, refer to Fig. 4, which is a detailed flowchart of step S302. The flow includes the steps:
S400: select, according to the sorting, the K clients with the largest local density as reference points.
For example, the 3 clients A, B and C with the largest local density are selected as reference points.
S402: group each of the K reference points into one category together with the similar clients whose distance to it is less than the threshold.
For example, for client A, all similar clients whose distance to client A is less than the threshold $d_c$ are found (that is, all clients relatively similar to client A), and client A and the clients found are grouped into the first category. For client B, all similar clients whose distance to client B is less than the threshold $d_c$ are found (that is, all clients relatively similar to client B), and client B and the clients found are grouped into the second category. For client C, all similar clients whose distance to client C is less than the threshold $d_c$ are found (that is, all clients relatively similar to client C), and client C and the clients found are grouped into the third category.
S404: for the clients remaining after this grouping, calculate the distance between each remaining client and each of the K reference points, and assign the client to the category of the nearest reference point.
For example, suppose client A and clients $A_1$, $A_2$, $A_3$ are grouped into the first category, client B and client $B_1$ into the second category, and client C and clients $C_1$, $C_2$ into the third category, while clients D and E remain unassigned. The distances between client D and the reference clients A, B and C, and between client E and the reference clients A, B and C, are then calculated. If client D is nearest to client B and client E is nearest to client A, client D is assigned to the second category and client E to the first category.
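Steps S400 to S404 can be sketched as follows, assuming a precomputed distance matrix and list of local densities. The text does not say what happens when a client lies within the threshold of more than one reference point; this sketch processes the reference points in descending order of density and lets the first one claim the client, which is an assumption. The function name `assign_categories` is hypothetical.

```python
def assign_categories(dist, density, d_c, k):
    """Return (reference indices, category label per client) for K categories."""
    n = len(dist)
    # S400: indices sorted by local density, largest first (ties keep index order).
    order = sorted(range(n), key=lambda i: -density[i])
    refs = order[:k]
    labels = [None] * n
    for c, r in enumerate(refs):
        labels[r] = c
    # S402: each reference point claims unassigned clients closer than d_c.
    for c, r in enumerate(refs):
        for j in range(n):
            if labels[j] is None and dist[r][j] < d_c:
                labels[j] = c
    # S404: every remaining client joins the category of its nearest reference.
    for j in range(n):
        if labels[j] is None:
            labels[j] = min(range(k), key=lambda c: dist[refs[c]][j])
    return refs, labels
```

Category labels are the integers 0 to K-1, in the order of the reference points' densities, so label 0 corresponds to the "first category" of the example above.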
Returning to Fig. 3, S304: determine the optimal value of the number of categories K.
Specifically, choosing a different number K of reference clients yields a different number K of client categories. For example, when the 3 clients with the largest local density are selected as reference points, all clients are divided into 3 categories; when the 4 clients with the largest local density are selected as reference points, all clients are divided into 4 categories; and so on. The optimal value of the number of categories K therefore needs to be determined by a predetermined algorithm, so that the resulting classification is the most reasonable.
In the present embodiment, all clients can be regarded as a domain U, in which each client is a sample (n samples in total), each sample has m attributes (the information fields), and all samples in the domain U are divided into K categories. First, for the K client categories, the sum of the distances from the center of each client category to the center of the entire domain is calculated, denoted the first distance sum $D_1$. Then, for each client category, the sum of the distances from each sample (client) in the category to the center of that category is calculated, denoted the second distance sum $D_2$, and the total of the second distance sums of all K client categories is calculated, denoted the third distance sum $D_3$. Finally, the ratio $D_1/D_3$ of the first distance sum to the third distance sum is calculated, and the number of client categories K at which $D_1/D_3$ is largest is taken as the optimal value. Here, a center is obtained by averaging each attribute over the corresponding samples. For example, the center of a client category is obtained by averaging each attribute over all samples in that category, and the center of the entire domain is obtained by averaging each attribute over all samples in the domain.
For example, suppose that when the number of categories is $K_1$ the calculated ratio is $D_1/D_3 = R_1$; when the number of categories is $K_2$ the calculated ratio is $D_1/D_3 = R_2$; when the number of categories is $K_3$ the calculated ratio is $D_1/D_3 = R_3$; and $R_2 > R_3 > R_1$. Then the number of categories $K_2$ corresponding to $R_2$ is taken as the optimal value. That is, in this case, dividing all clients into $K_2$ categories is the most reasonable.
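The ratio criterion can be sketched as below, assuming each sample is a list of m numeric attributes and that a category labelling has already been produced for every candidate K. The names `mean_point`, `separation_ratio` and `best_k` are illustrative; returning infinity when every sample coincides with its category center is an assumption, since the text does not cover a zero denominator.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(points):
    """Center of a set of samples: the per-attribute average."""
    m = len(points[0])
    return [sum(p[a] for p in points) / len(points) for a in range(m)]


def separation_ratio(samples, labels):
    """Return D1 / D3 for one division of the samples into categories."""
    overall = mean_point(samples)
    d1 = 0.0  # sum of distances from category centers to the domain center
    d3 = 0.0  # total of the within-category distance sums (the D2 values)
    for c in sorted(set(labels)):
        members = [s for s, lab in zip(samples, labels) if lab == c]
        center = mean_point(members)
        d1 += euclidean(center, overall)
        d3 += sum(euclidean(s, center) for s in members)
    # Assumption: a zero denominator is treated as an unbounded ratio.
    return math.inf if d3 == 0 else d1 / d3


def best_k(samples, labelings):
    """Pick the K whose labelling has the largest ratio.

    `labelings` maps each candidate K to its list of category labels.
    """
    return max(labelings, key=lambda k: separation_ratio(samples, labelings[k]))
```

For four one-dimensional samples at 0, 2, 10 and 12 split into the pairs {0, 2} and {10, 12}, the category centers are 1 and 11 and the domain center is 6, so $D_1 = 10$, $D_3 = 4$ and the ratio is 2.5.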
S306: complete the category division of all clients according to the determined optimal number of categories.
For example, if the optimal value of the number of categories K is determined to be 4, all clients are divided into 4 categories in the manner described above, with the 4 clients of largest local density selected as reference points, which completes the category division of all clients.
The client segmentation method described in the present embodiment can divide all clients into different categories accurately and comprehensively according to client attributes, and optimizes the number of categories so that the classification is more reasonable. It can provide sales staff with an effective reference for product promotion and facilitates precision marketing.
Second embodiment
As shown in Fig. 5, the second embodiment of the present invention proposes a client segmentation system 50. In the present embodiment, the client segmentation system 50 includes an acquisition module 502, a screening module 504, a computing module 506 and a classification module 508.
The acquisition module 502 is used to obtain the information of all clients.
Specifically, the acquisition module 502 obtains the relevant information of all clients that need to be classified and counted, where the number of clients is n (n is a positive integer).
The screening module 504 is used to screen preset information fields from the information of each client.
Specifically, m information fields with reference value (m is a positive integer) can be preset as the basis for classifying clients. That is, each client has m effective information fields, for example the client's region, the nature of the client's employer, the insurance types and liabilities the client has previously purchased, the coverage amount, the premium and the claims information.
In the present embodiment, the content of the m information fields can be converted into corresponding numerical values, so that the distance between clients can subsequently be calculated and the similarity between clients determined. For example, if the client's region is Beijing, the corresponding information field is recorded as the value 1; if the client's region is Shanghai, the corresponding information field is recorded as the value 2; and so on. The value for each region can be set according to preset criteria such as the geographical distance of the client's location or the size of the city. As another example, if the client's coverage amount is below 100,000, the corresponding information field is recorded as the value 1; if it is between 100,000 and 500,000, the field is recorded as the value 2; if it is between 500,000 and 1,000,000, the field is recorded as the value 3; and so on.
The computing module 506 is used to establish a density-based clustering algorithm model and to calculate the local density corresponding to each client according to the screened information fields.
Specifically, the computing module 506 first evaluates the distance between every two clients according to the Euclidean distance formula. In the present embodiment, the Euclidean distance formula is
$$d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
where $d_{ij}$ is the distance between client i (i = 1, 2, ..., n) and client j (j = 1, 2, ..., n), $x_{i1}, \dots, x_{im}$ are the numerical values of the m information fields of client i, and $x_{j1}, \dots, x_{jm}$ are the numerical values of the m information fields of client j. The distance reflects the similarity between two clients: the smaller the calculated value of $d_{ij}$, the more similar client i and client j are.
In the present embodiment, the distance $d_{ij}$ must be calculated for every pair among the n clients, so that the similarity between every two clients can be determined.
The computing module 506 sets the threshold for distinguishing client similarity. In the present embodiment, the threshold is denoted $d_c$ and is used to decide whether two clients are relatively similar or relatively dissimilar. It must satisfy the following condition: among the values $d_{ij}$ calculated for the distance between every two clients, $d_c$ is greater than or equal to 80% of all the $d_{ij}$ values. For example, if 100 values of $d_{ij}$ are calculated for all clients, the threshold $d_c$ must be greater than or equal to 80 of them. When the distance $d_{ij}$ between two clients is less than the threshold $d_c$, the two clients are considered relatively similar; when the distance $d_{ij}$ between two clients is greater than or equal to the threshold $d_c$, the two clients are considered relatively dissimilar.
The computing module 506 calculates the local density corresponding to each client according to the threshold and the local density formula. In the present embodiment, the local density formula is
$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c)$$
where
$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases}$$
The local density reflects the number of other clients that are relatively similar to a given client: the larger the calculated local density, the more other clients are relatively similar to that client.
The classification module 508 is used to divide all clients into different categories according to the calculation result.
Specifically, the classification module 508 first sorts the calculated local densities in descending order. A local density is calculated for each client, so the n clients correspond to n local densities, which are then sorted in descending order.
Then, the classification module 508 divides all clients into K categories (0<K<n), using the K clients with the largest local density as reference points. This specifically includes:
(1) Selecting, according to the sorting, the K clients with the largest local density as reference points. For example, the 3 clients A, B and C with the largest local density are selected as reference points. A reference point is a client used as the standard for defining a category; that is, other clients that are relatively similar to the reference client can be grouped into one category with that client.
(2) Grouping each of the K reference points into one category together with the similar clients whose distance to it is less than the threshold. For example, for client A, all similar clients whose distance to client A is less than the threshold $d_c$ are found (that is, all clients relatively similar to client A), and client A and the clients found are grouped into the first category. For client B, all similar clients whose distance to client B is less than the threshold $d_c$ are found (that is, all clients relatively similar to client B), and client B and the clients found are grouped into the second category. For client C, all similar clients whose distance to client C is less than the threshold $d_c$ are found (that is, all clients relatively similar to client C), and client C and the clients found are grouped into the third category.
(3) For the clients remaining after this grouping, calculating the distance between each remaining client and each of the K reference points, and assigning the client to the category of the nearest reference point. For example, suppose client A and clients $A_1$, $A_2$, $A_3$ are grouped into the first category, client B and client $B_1$ into the second category, and client C and clients $C_1$, $C_2$ into the third category, while clients D and E remain unassigned. The distances between client D and the reference clients A, B and C, and between client E and the reference clients A, B and C, are then calculated. If client D is nearest to client B and client E is nearest to client A, client D is assigned to the second category and client E to the first category.
Then, the classification module 508 determines the optimal value of the number of categories K. Specifically, choosing a different number K of reference clients yields a different number K of client categories. For example, when the 3 clients with the largest local density are selected as reference points, all clients are divided into 3 categories; when the 4 clients with the largest local density are selected as reference points, all clients are divided into 4 categories; and so on. The optimal value of the number of categories K therefore needs to be determined by a predetermined algorithm, so that the resulting classification is the most reasonable.
In the present embodiment, all clients can be regarded as a domain U, in which each client is a sample (n samples in total), each sample has m attributes (the information fields), and all samples in the domain U are divided into K categories. First, for the K client categories, the sum of the distances from the center of each client category to the center of the entire domain is calculated, denoted the first distance sum $D_1$. Then, for each client category, the sum of the distances from each sample (client) in the category to the center of that category is calculated, denoted the second distance sum $D_2$, and the total of the second distance sums of all K client categories is calculated, denoted the third distance sum $D_3$. Finally, the ratio $D_1/D_3$ of the first distance sum to the third distance sum is calculated, and the number of client categories K at which $D_1/D_3$ is largest is taken as the optimal value. Here, a center is obtained by averaging each attribute over the corresponding samples. For example, the center of a client category is obtained by averaging each attribute over all samples in that category, and the center of the entire domain is obtained by averaging each attribute over all samples in the domain.
For example, suppose that when the number of categories is $K_1$ the calculated ratio is $D_1/D_3 = R_1$; when the number of categories is $K_2$ the calculated ratio is $D_1/D_3 = R_2$; when the number of categories is $K_3$ the calculated ratio is $D_1/D_3 = R_3$; and $R_2 > R_3 > R_1$. Then the number of categories $K_2$ corresponding to $R_2$ is taken as the optimal value. That is, in this case, dividing all clients into $K_2$ categories is the most reasonable.
Finally, the classification module 508 completes the category division of all clients according to the determined optimal number of categories. For example, if the optimal value of the number of categories K is determined to be 4, all clients are divided into 4 categories in the manner described above, with the 4 clients of largest local density selected as reference points, which completes the category division of all clients.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
It should be noted that, herein, the terms "comprise", "include" and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to perform the methods described in the embodiments of the present invention.
The preferred embodiments of the present invention have been described above, and this description does not thereby limit the scope of rights of the present invention. The serial numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments. In addition, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here. Those skilled in the art can implement the present invention in many variant forms without departing from its scope and essence; for example, a feature of one embodiment can be used in another embodiment to obtain yet another embodiment. Any modification, equivalent replacement, or improvement made within the technical concept of the present invention shall fall within the scope of rights of the present invention.
Claims (12)
- 1. A client segmentation method, characterized in that the method comprises the steps of: obtaining the information of all clients; selecting preset information fields from the information of each client, including an information field for the client's location and an information field for the product purchased by the client; numeralizing the content of the selected information fields according to the attribute characteristics of each information field, including: setting the numerical value corresponding to the client-location information field according to the geographical distance or the city size of the client's location, and setting the numerical value corresponding to the purchased-product information field according to the amount range of the product purchased by the client, so as to obtain the numerical value corresponding to each selected information field; establishing a density-based algorithm model, and calculating the local density corresponding to each client according to the numerical values corresponding to the selected information fields; and dividing all clients into different categories according to the calculated local density.
- 2. The client segmentation method according to claim 1, characterized in that the preset information fields further include the nature of the client's employer, and the insurance type and liability, insured amount, premium, and claim settlement information of the product purchased by the client.
- 3. The client segmentation method according to claim 1, characterized in that the step of establishing a density-based algorithm model and calculating the local density corresponding to each client according to the numerical values corresponding to the selected information fields specifically comprises: evaluating the distance between two clients according to a Euclidean distance formula; setting a threshold $d_c$ for distinguishing client similarity; and calculating the local density corresponding to each client according to the threshold $d_c$ and a local density formula.
- 4. The client segmentation method according to claim 3, characterized in that the Euclidean distance formula is $d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$, wherein $d_{ij}$ is the distance between client i and client j, $x_{i1}$ to $x_{im}$ are the numerical values corresponding to the m information fields of client i, and $x_{j1}$ to $x_{jm}$ are the numerical values corresponding to the m information fields of client j.
- 5. The client segmentation method according to claim 4, characterized in that the threshold $d_c$ satisfies the following condition: the calculated distances $d_{ij}$ between every two clients are collected, and the value of $d_c$ is greater than or equal to 80% of all the $d_{ij}$ values.
- 6. The client segmentation method according to claim 4, characterized in that the local density formula is […], wherein […].
- 7. The client segmentation method according to claim 1, characterized in that the step of dividing all clients into different categories according to the calculated local density specifically comprises: sorting the calculated local densities in descending order; dividing all clients into K categories using the K clients with the largest local density as reference points; determining the optimum value of the classification number K; and completing the category division of all clients according to the determined best classification number.
- 8. The client segmentation method according to claim 7, characterized in that the step of dividing all clients into K categories using the K clients with the largest local density as reference points specifically comprises: selecting, according to the sorting, the K clients with the largest local density as reference points; grouping each of the K reference points with the similar clients whose distance to it is less than the threshold; and, for the clients remaining after this grouping, calculating the distance between each remaining client and each of the K reference points, and grouping each remaining client with its nearest reference point.
- 9. The client segmentation method according to claim 7, characterized in that the step of determining the optimum value of the classification number K specifically comprises: regarding all clients as a domain, wherein each client is a sample; for the K categories, calculating the sum of the distances from the center of each category to the center of the entire domain, denoted the first distance sum; for each category, calculating the sum of the distances from each sample in the category to the center of the category, denoted the second distance sum; calculating the total of the second distance sums of all K categories, denoted the third distance sum; calculating the ratio of the first distance sum to the third distance sum; and taking the classification number K for which the ratio is largest as the optimum value.
- 10. A client segmentation system, characterized in that the system comprises: an acquisition module, configured to obtain the information of all clients; a screening module, configured to select preset information fields from the information of each client, including an information field for the client's location and an information field for the product purchased by the client; a computing module, configured to numeralize the content of the selected information fields according to the attribute characteristics of each information field, including: setting the numerical value corresponding to the client-location information field according to the geographical distance or the city size of the client's location, and setting the numerical value corresponding to the purchased-product information field according to the amount range of the product purchased by the client, so as to obtain the numerical value corresponding to each selected information field; and further configured to establish a density-based algorithm model and calculate the local density corresponding to each client according to the numerical values corresponding to the selected information fields; and a sort module, configured to divide all clients into different categories according to the calculated local density.
- 11. The client segmentation system according to claim 10, characterized in that the process by which the computing module calculates the local density corresponding to each client specifically comprises: evaluating the distance between two clients according to a Euclidean distance formula; setting a threshold $d_c$ for distinguishing client similarity; and calculating the local density corresponding to each client according to the threshold $d_c$ and a local density formula.
- 12. The client segmentation system according to claim 10, characterized in that the process by which the sort module divides all clients into different categories according to the calculation result specifically comprises: sorting the calculated local densities in descending order; dividing all clients into K categories using the K clients with the largest local density as reference points; determining the optimum value of the classification number K; and completing the category division of all clients according to the determined best classification number.
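The distance, threshold, and local density steps recited in the claims can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation. The local density formula is not reproduced in this text, so the sketch assumes a simple cutoff count (the number of other clients whose distance is less than the threshold), as is common in density-based clustering. The threshold is taken as the smallest observed distance that is greater than or equal to 80% of all pairwise distances. The function names and sample data are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(clients):
    """Euclidean distance d_ij for every unordered pair of clients (i < j)."""
    return {(i, j): math.dist(clients[i], clients[j])
            for i, j in combinations(range(len(clients)), 2)}


def threshold_dc(distances, fraction=0.8):
    """Smallest observed distance that is >= `fraction` of all d_ij values."""
    values = sorted(distances.values())
    if not values:
        raise ValueError("at least two clients are required")
    index = max(math.ceil(fraction * len(values)) - 1, 0)
    return values[index]


def local_density(clients, d_c):
    """Assumed density: number of other clients strictly closer than d_c."""
    rho = [0] * len(clients)
    for (i, j), d in pairwise_distances(clients).items():
        if d < d_c:
            rho[i] += 1
            rho[j] += 1
    return rho


# Hypothetical clients, each described by two numeralized information fields.
clients = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
d_c = threshold_dc(pairwise_distances(clients))
rho = local_density(clients, 7.0)  # the middle client has two close neighbours
```

Computing every pairwise distance takes time quadratic in the number of clients, which is acceptable for a sketch but may need an index structure or sampling for large client bases.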
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611005111.7A CN107194815B (en) | 2016-11-15 | 2016-11-15 | Client segmentation method and system |
PCT/CN2017/091365 WO2018090643A1 (en) | 2016-11-15 | 2017-06-30 | Customer classification method, and electronic device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611005111.7A CN107194815B (en) | 2016-11-15 | 2016-11-15 | Client segmentation method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107194815A (en) | 2017-09-22 |
CN107194815B (en) | 2018-06-22 |
Family ID: 59871619
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611005111.7A Active CN107194815B (en) | 2016-11-15 | 2016-11-15 | Client segmentation method and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107194815B (en) |
WO (1) | WO2018090643A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108153824B (en) * | 2017-12-06 | 2020-04-24 | 阿里巴巴集团控股有限公司 | Method and device for determining target user group |
CN108985950B (en) * | 2018-07-13 | 2023-04-18 | 平安科技(深圳)有限公司 | Electronic device, user fraud protection risk early warning method and storage medium |
CN109670852A (en) * | 2018-09-26 | 2019-04-23 | 平安普惠企业管理有限公司 | User classification method, device, terminal and storage medium |
CN113094615B (en) * | 2019-12-23 | 2024-03-01 | 中国石油天然气股份有限公司 | Message pushing method, device, equipment and storage medium |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20020049164A (en) * | 2000-12-19 | 2002-06-26 | 오길록 | The System and Method for Auto - Document - classification by Learning Category using Genetic algorithm and Term cluster |
CN101420313B (en) * | 2007-10-22 | 2011-01-12 | 北京搜狗科技发展有限公司 | Method and system for clustering customer terminal user group |
CN102339389B (en) * | 2011-09-14 | 2013-05-29 | 清华大学 | Fault detection method for one-class support vector machine based on density parameter optimization |
CN102664961B (en) * | 2012-05-04 | 2014-08-20 | 北京邮电大学 | Method for anomaly detection in MapReduce environment |
US9111228B2 (en) * | 2012-10-29 | 2015-08-18 | Sas Institute Inc. | System and method for combining segmentation data |
CN103559630A (en) * | 2013-10-31 | 2014-02-05 | 华南师范大学 | Customer segmentation method based on customer attribute and behavior characteristic analysis |
CN104751263A (en) * | 2013-12-31 | 2015-07-01 | 南京理工大学常熟研究院有限公司 | Metrological calibration service oriented intelligent client grade classification method |
- 2016-11-15: CN application CN201611005111.7A, published as CN107194815B (status: Active)
- 2017-06-30: WO application PCT/CN2017/091365, published as WO2018090643A1 (status: Application Filing)
Also Published As
Publication number | Publication date |
---|---|
WO2018090643A1 (en) | 2018-05-24 |
CN107194815A (en) | 2017-09-22 |
Similar Documents
Publication | Title |
---|---|
CN107194815B (en) | Client segmentation method and system |
CN111291816B (en) | Method and device for carrying out feature processing aiming at user classification model |
CN106651057A (en) | Mobile terminal user age prediction method based on installation package sequence table |
CN107622326A (en) | User's classification, available resources Forecasting Methodology, device and equipment |
CN112241494A (en) | Key information pushing method and device based on user behavior data |
CN112396428B (en) | User portrait data-based customer group classification management method and device |
CN110276520A (en) | Project case screening technique and device |
CN109840843A (en) | The automatic branch mailbox algorithm of continuous type feature based on similarity combination |
CN106570015A (en) | Image searching method and device |
WO2018006631A1 (en) | User level automatic segmentation method and system |
CN106447385A (en) | Data processing method and apparatus |
CN109191185A (en) | A kind of visitor's heap sort method and system |
CN111582722B (en) | Risk identification method and device, electronic equipment and readable storage medium |
CN113378987A (en) | Density-based unbalanced data mixed sampling algorithm |
CN108563786A (en) | Text classification and methods of exhibiting, device, computer equipment and storage medium |
CN111325495B (en) | Abnormal part classification method and system |
CN112241820A (en) | Risk identification method and device for key nodes in fund flow and computing equipment |
CN113011503B (en) | Data evidence obtaining method of electronic equipment, storage medium and terminal |
CN109785050A (en) | A kind of cloud resources of production matching recommended method and device |
CN113919932A (en) | Client scoring deviation detection method based on loan application scoring model |
CN115599985A (en) | Target customer identification method and system, electronic device and readable storage medium |
CN111126419A (en) | Dot clustering method and device |
CN115187387B (en) | Identification method and equipment for risk merchant |
CN115438138B (en) | Employment center identification method and device, electronic equipment and storage medium |
CN116431459B (en) | Distributed log link tracking data processing method and device |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |