CN107194815B - Client segmentation method and system - Google Patents

Publication number
CN107194815B
Authority
CN
China
Prior art keywords
client
clients
information field
distance
local density
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611005111.7A
Other languages
Chinese (zh)
Other versions
CN107194815A (en)
Inventor
马向东
吴海波
冯雨旸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201611005111.7A (patent CN107194815B)
Priority to PCT/CN2017/091365 (WO2018090643A1)
Publication of CN107194815A
Application granted
Publication of CN107194815B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Abstract

The invention discloses a client segmentation method and system. The method includes: obtaining the information of all clients; screening preset information fields from the information of each client; establishing a density-based clustering algorithm model and calculating the local density corresponding to each client according to the screened information fields; and dividing all clients into different categories according to the calculated local densities. Clients can thereby be classified accurately and comprehensively, providing an effective reference basis for product promotion.

Description

Client segmentation method and system
Technical field
The present invention relates to the technical field of data processing, and more particularly to a client segmentation method and system.
Background
In the insurance industry, insured clients usually need to be classified and counted so that sales staff can devise different marketing strategies for different client categories. However, existing approaches to classifying clients still simply partition them directly by data such as age, insured amount and premium. Such approaches use few evaluation criteria, produce results of limited accuracy, and cannot mine deeper information from the data, so they cannot give sales staff an effective reference basis for product promotion.
Summary of the invention
In view of this, the purpose of the present invention is to provide a client segmentation method and system that solve the problem of how to classify clients accurately and comprehensively.
To achieve the above object, the present invention provides a client segmentation method comprising the steps of:
obtaining the information of all clients;
screening preset information fields from the information of each client;
establishing a density-based clustering algorithm model and calculating the local density corresponding to each client according to the screened information fields; and
dividing all clients into different categories according to the calculated local densities.
Preferably, the preset information fields include the area where the client is located, the nature of the client's employer, the insurance types and liabilities the client has previously purchased, the insured amount, the premium, and claims information; the content of each information field corresponds to a numerical value.
Preferably, the step of establishing a density-based clustering algorithm model and calculating the local density corresponding to each client according to the screened information fields specifically includes:
assessing the distance between two clients according to the Euclidean distance formula;
setting a threshold $d_c$ for distinguishing client similarity; and
calculating the local density corresponding to each client according to the threshold $d_c$ and the local density formula.
Preferably, the Euclidean distance formula is
$$d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
where $d_{ij}$ is the distance between client i and client j, $x_{i1}, \dots, x_{im}$ are the numerical values of the m information fields of client i, and $x_{j1}, \dots, x_{jm}$ are the numerical values of the m information fields of client j.
Preferably, the threshold $d_c$ satisfies the following condition: among the calculated distances $d_{ij}$ between every two clients, the value of $d_c$ is greater than or equal to 80% of all the $d_{ij}$ values.
Preferably, the local density formula is
$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c)$$
where
$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases}$$
Preferably, the step of dividing all clients into different categories according to the calculation result specifically includes:
sorting the calculated local densities in descending order;
dividing all clients into K categories, using the K clients with the highest local density as reference points;
determining the optimal value of the category number K; and
completing the category division of all clients according to the optimal category number so determined.
Preferably, the step of dividing all clients into K categories, using the K clients with the highest local density as reference points, specifically includes:
selecting, according to the sorted order, the K clients with the highest local density as reference points;
for each of the K reference points, grouping the reference point and the similar clients whose distance to it is less than the threshold into one category; and
for the clients remaining after this grouping, calculating the distance between each remaining client and each of the K reference points, and placing the remaining client in the category of the nearest reference point.
Preferably, the step of determining the optimal value of the category number K specifically includes:
regarding all clients as one domain, in which each client is a sample;
for the K categories, calculating the sum of the distances from the centre of each category to the centre of the entire domain, denoted the first distance sum;
for each category, calculating the sum of the distances from each sample in the category to the centre of that category, denoted the second distance sum;
calculating the total of the second distance sums of all K categories, denoted the third distance sum;
calculating the ratio of the first distance sum to the third distance sum; and
taking the category number K for which the ratio is largest as the optimal value.
The client segmentation method proposed by the present invention can divide all clients into different categories accurately and comprehensively according to client attributes, and optimises the number of categories so that the classification is more reasonable. It can provide sales staff with an effective reference basis for product promotion and facilitates precision marketing.
To achieve the above object, the present invention also proposes a client segmentation system comprising:
an acquisition module for obtaining the information of all clients;
a screening module for screening preset information fields from the information of each client;
a computing module for establishing a density-based clustering algorithm model and calculating the local density corresponding to each client according to the screened information fields; and
a classification module for dividing all clients into different categories according to the calculated local densities.
Preferably, the process by which the computing module calculates the local density corresponding to each client specifically includes:
assessing the distance between two clients according to the Euclidean distance formula;
setting a threshold $d_c$ for distinguishing client similarity; and
calculating the local density corresponding to each client according to the threshold $d_c$ and the local density formula.
Preferably, the process by which the classification module divides all clients into different categories according to the calculation result specifically includes:
sorting the calculated local densities in descending order;
dividing all clients into K categories, using the K clients with the highest local density as reference points;
determining the optimal value of the category number K; and
completing the category division of all clients according to the optimal category number so determined.
The client segmentation system proposed by the present invention can divide all clients into different categories accurately and comprehensively according to client attributes, and optimises the number of categories so that the classification is more reasonable. It can provide sales staff with an effective reference basis for product promotion and facilitates precision marketing.
Description of the drawings
Fig. 1 is a flowchart of the client segmentation method proposed by the first embodiment of the present invention;
Fig. 2 is a detailed flowchart of step S104 in Fig. 1;
Fig. 3 is a detailed flowchart of step S106 in Fig. 1;
Fig. 4 is a detailed flowchart of step S302 in Fig. 3;
Fig. 5 is a module diagram of the client segmentation system proposed by the second embodiment of the present invention.
The realisation of the object, the functional features and the advantages of the present invention are further described below with reference to the accompanying drawings and in conjunction with the embodiments.
Detailed description
To make the technical problem to be solved, the technical solutions and the advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
First embodiment
As shown in Fig. 1, the first embodiment of the present invention proposes a client segmentation method comprising the following steps:
S100: obtain the information of all clients.
Specifically, the relevant information of all clients that need to be classified is obtained, where the number of clients is n (n is a positive integer).
S102: screen preset information fields from the information of each client.
Specifically, m information fields with reference value (m is a positive integer) can be preset as the basis for classifying clients. That is, each client has m valid information fields, for example the area where the client is located, the nature of the client's employer, the insurance types and liabilities the client has previously purchased, the insured amount, the premium, and claims information.
In this embodiment, the content of the m information fields can be converted into corresponding numerical values so that the distance between clients can subsequently be calculated and the similarity between clients judged. For example, if the client's area is Beijing, the corresponding information field is recorded as the value 1; if the client's area is Shanghai, it is recorded as the value 2; and so on. The value for each location can be set according to criteria such as geographical proximity or city size. As another example, if the client's insured amount is below 100,000, the corresponding information field is recorded as the value 1; if it is 100,000 to 500,000, the value 2; if it is 500,000 to 1,000,000, the value 3; and so on.
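The conversion of field contents to numbers can be illustrated with a minimal Python sketch. The Beijing/Shanghai codes and the three insured-amount buckets follow the examples in the text; the names `AREA_CODES`, `encode_insured_amount` and `encode_client`, the record layout, and the treatment of bucket boundaries (half-open intervals) are assumptions made only for illustration.

```python
# Hypothetical lookup table; only the two codes below appear in the text.
AREA_CODES = {"Beijing": 1, "Shanghai": 2}

def encode_insured_amount(amount):
    """Map an insured amount to the bucket codes 1, 2, 3 used in the text.

    Which bucket a boundary value such as 100,000 falls into is an assumption.
    """
    if amount < 100_000:
        return 1
    if amount < 500_000:
        return 2
    if amount <= 1_000_000:
        return 3
    raise ValueError("the text defines no bucket above 1,000,000")

def encode_client(record):
    """Turn one client record (a dict) into a numeric feature vector."""
    return [AREA_CODES[record["area"]],
            encode_insured_amount(record["insured_amount"])]

clients = [
    {"area": "Beijing", "insured_amount": 80_000},
    {"area": "Shanghai", "insured_amount": 300_000},
]
vectors = [encode_client(c) for c in clients]
```

Each client then becomes a vector of m numbers (here m = 2), which is the form the distance calculation in the following step operates on.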
S104: establish a density-based clustering algorithm model and calculate the local density corresponding to each client according to the screened information fields.
Specifically, Fig. 2 shows the detailed flow of step S104, which includes the steps:
S200: assess the distance between two clients according to the Euclidean distance formula.
In this embodiment, the Euclidean distance formula is
$$d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
where $d_{ij}$ is the distance between client i (i = 1, 2, ..., n) and client j (j = 1, 2, ..., n), $x_{i1}, \dots, x_{im}$ are the numerical values of the m information fields of client i, and $x_{j1}, \dots, x_{jm}$ are the numerical values of the m information fields of client j. The distance reflects the similarity between two clients: the smaller the calculated value of $d_{ij}$, the more similar client i and client j are.
In this embodiment, the distance $d_{ij}$ must be calculated for every pair of clients among the n clients so that the similarity between every two clients can be judged.
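As an illustration of step S200, the following sketch computes the full matrix of pairwise Euclidean distances using only the Python standard library. The function names and the sample vectors are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise_distances(vectors):
    """Return the symmetric n x n matrix d with d[i][j] = distance(i, j)."""
    n = len(vectors)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = euclidean(vectors[i], vectors[j])
    return d

# Three made-up clients, each described by m = 2 encoded fields.
vectors = [[1, 1], [1, 2], [4, 5]]
d = pairwise_distances(vectors)
```

Because the distance is symmetric, only the n(n-1)/2 pairs with i < j need to be computed; the sketch mirrors each value into both halves of the matrix.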
S202: set the threshold for distinguishing client similarity.
In this embodiment, the threshold is denoted $d_c$ and is used to distinguish whether two clients are relatively similar or relatively dissimilar. The condition it must satisfy is: among the calculated distances $d_{ij}$ between every two clients, the value of $d_c$ is greater than or equal to 80% of all the $d_{ij}$ values. For example, if 100 values of $d_{ij}$ are calculated for all clients, the threshold $d_c$ must be greater than or equal to 80 of those $d_{ij}$ values. When the distance $d_{ij}$ between two clients is less than the threshold $d_c$, the two clients are considered relatively similar; when the distance $d_{ij}$ is greater than or equal to the threshold $d_c$, the two clients are considered relatively dissimilar.
S204: calculate the local density corresponding to each client according to the threshold and the local density formula.
In this embodiment, the local density formula is
$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c)$$
where
$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases}$$
The local density reflects the number of other clients that are relatively similar to a given client: the larger the calculated local density, the more clients are relatively similar to that client.
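The counting described here can be sketched as follows: for each client, count the other clients whose distance to it is below $d_c$. The function name, the one-dimensional toy positions and the hand-picked threshold are hypothetical.

```python
def local_densities(d, d_c):
    """For each client i, count the other clients j with d[i][j] < d_c."""
    n = len(d)
    return [sum(1 for j in range(n) if j != i and d[i][j] < d_c)
            for i in range(n)]

# Toy data: four clients on a line, so the distance is the absolute difference.
positions = [0, 1, 2, 10]
d = [[abs(a - b) for b in positions] for a in positions]
rho = local_densities(d, d_c=1.5)
```

In this toy data the client at position 1 has two neighbours within the threshold, its two neighbours have one each, and the isolated client at position 10 has none.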
Returning to Fig. 1, S106: divide all clients into different categories according to the calculation result.
Specifically, Fig. 3 shows the detailed flow of step S106, which includes the steps:
S300: sort the calculated local densities in descending order.
Specifically, a corresponding local density is calculated for each client, so n clients correspond to n local densities, and these n local densities are then sorted in descending order.
S302: divide all clients into K categories (0 < K < n), using the K clients with the highest local density as reference points. A reference point is a client used as the standard for defining a category; other clients that are relatively similar to the reference-point client are placed in the same category as that client.
Specifically, Fig. 4 shows the detailed flow of step S302, which includes the steps:
S400: select, according to the sorted order, the K clients with the highest local density as reference points.
For example, the 3 clients A, B and C with the highest local density are selected as reference points.
S402: for each of the K reference points, group the reference point and the similar clients whose distance to it is less than the threshold into one category.
For example, for client A, all similar clients whose distance to client A is less than the threshold $d_c$ are found (that is, all clients relatively similar to client A), and client A and the clients found are placed in the first category. For client B, all similar clients whose distance to client B is less than the threshold $d_c$ are found, and client B and the clients found are placed in the second category. For client C, all similar clients whose distance to client C is less than the threshold $d_c$ are found, and client C and the clients found are placed in the third category.
S404: for the clients remaining after this grouping, calculate the distance between each remaining client and each of the K reference points, and place the client in the category of the nearest reference point.
For example, suppose client A and clients $A_1$, $A_2$, $A_3$ are placed in the first category, client B and client $B_1$ in the second category, and client C and clients $C_1$, $C_2$ in the third category, while clients D and E remain unclassified. The distances between client D and the reference points A, B and C, and between client E and the reference points A, B and C, are then calculated. If client D is closest to client B and client E is closest to client A, client D is placed in the second category and client E in the first category.
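Steps S300 to S404 can be sketched in a few lines of Python. The text does not say what happens when a client lies within $d_c$ of more than one reference point; this sketch assumes the reference point with the higher local density claims it first. The function name, the toy data and the hand-picked threshold are likewise hypothetical.

```python
def assign_categories(d, densities, d_c, k):
    """Return (reference point indices, category label per client)."""
    n = len(d)
    # S300/S400: sort by density, descending (ties keep index order), take top K.
    order = sorted(range(n), key=lambda i: -densities[i])
    refs = order[:k]
    labels = {r: c for c, r in enumerate(refs)}
    # S402: each reference point claims the unassigned clients closer than d_c.
    for c, r in enumerate(refs):
        for i in range(n):
            if i not in labels and d[r][i] < d_c:
                labels[i] = c
    # S404: every remaining client joins its nearest reference point.
    for i in range(n):
        if i not in labels:
            labels[i] = min(range(k), key=lambda c: d[refs[c]][i])
    return refs, [labels[i] for i in range(n)]

# Toy data: seven clients on a line; distance is the absolute difference.
positions = [0, 1, 2, 10, 11, 12, 30]
d = [[abs(a - b) for b in positions] for a in positions]
densities = [1, 2, 1, 1, 2, 1, 0]   # neighbours closer than d_c = 1.5
refs, labels = assign_categories(d, densities, d_c=1.5, k=2)
```

Here the clients at positions 1 and 11 become the two reference points, their immediate neighbours join them in step S402, and the isolated client at position 30 is attached to the nearer reference point in step S404.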
Returning to Fig. 3, S304: determine the optimal value of the category number K.
Specifically, when the number K of clients selected as reference points differs, the resulting client categories differ as well. For example, when the 3 clients with the highest local density are selected as reference points, all clients are divided into 3 categories; when the 4 clients with the highest local density are selected as reference points, all clients are divided into 4 categories; and so on. The optimal value of the category number K therefore needs to be determined according to a predetermined algorithm so that the corresponding classification is the most reasonable.
In this embodiment, all clients can be regarded as one domain U, in which each client is a sample (n samples in total), each sample has m attributes (the information fields), and all samples in the domain U are divided into K categories. First, for the K client categories, the sum of the distances from the centre of each client category to the centre of the entire domain is calculated, giving the first distance sum $D_1$. Then, for each client category, the sum of the distances from each sample (client) in the category to the centre of that category is calculated, giving the second distance sum $D_2$, and the total of the second distance sums of all K client categories is calculated and denoted the third distance sum $D_3$. Finally, the ratio $D_1/D_3$ of the first distance sum to the third distance sum is calculated, and the category number K for which $D_1/D_3$ is largest is taken as the optimal value. Here the centre is obtained by averaging each attribute over the corresponding samples: the centre of a client category is the per-attribute average of all samples in that category, and the centre of the entire domain is the per-attribute average of all samples in the domain.
For example, suppose that when the category number is $K_1$ the calculated ratio is $D_1/D_3 = R_1$; when the category number is $K_2$ the calculated ratio is $D_1/D_3 = R_2$; and when the category number is $K_3$ the calculated ratio is $D_1/D_3 = R_3$, with $R_2 > R_3 > R_1$. The category number $K_2$ corresponding to $R_2$ is then taken as the optimal value. That is, in this case dividing all clients into $K_2$ categories is the most reasonable.
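The ratio $D_1/D_3$ can be computed as in the sketch below, which compares two candidate partitions of the same toy data. The candidate labelings are written by hand purely for illustration; in the method they would come from step S302 run with different K. Returning infinity when $D_3$ is zero is an assumption of this sketch, since the text does not address that case.

```python
import math

def centre(points):
    """Per-attribute average of a list of equal-length vectors."""
    return [sum(p[a] for p in points) / len(points)
            for a in range(len(points[0]))]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_d1_d3(vectors, labels):
    """D1: category centres to domain centre; D3: samples to own category centre."""
    overall = centre(vectors)
    groups = {}
    for v, lab in zip(vectors, labels):
        groups.setdefault(lab, []).append(v)
    d1 = d3 = 0.0
    for members in groups.values():
        c = centre(members)
        d1 += dist(c, overall)
        d3 += sum(dist(v, c) for v in members)  # this category's D2
    return d1 / d3 if d3 > 0 else math.inf

vectors = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
candidates = {            # hypothetical partitions for K = 2 and K = 3
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}
ratios = {k: ratio_d1_d3(vectors, labs) for k, labs in candidates.items()}
best_k = max(ratios, key=ratios.get)
```

A large ratio means the category centres are spread far apart relative to how tightly samples sit around their own centre, which is why the K = 2 partition wins for this data.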
S306: complete the category division of all clients according to the optimal category number so determined.
For example, if the optimal value of the category number K is determined to be 4, the 4 clients with the highest local density are selected as reference points in the manner described above, all clients are divided into 4 categories, and the category division of all clients is complete.
The client segmentation method of this embodiment can divide all clients into different categories accurately and comprehensively according to client attributes, and optimises the number of categories so that the classification is more reasonable. It can provide sales staff with an effective reference basis for product promotion and facilitates precision marketing.
Second embodiment
As shown in Fig. 5, the second embodiment of the present invention proposes a client segmentation system 50. In this embodiment, the client segmentation system 50 includes an acquisition module 502, a screening module 504, a computing module 506 and a classification module 508.
The acquisition module 502 is used to obtain the information of all clients.
Specifically, the acquisition module 502 obtains the relevant information of all clients that need to be classified, where the number of clients is n (n is a positive integer).
The screening module 504 is used to screen preset information fields from the information of each client.
Specifically, m information fields with reference value (m is a positive integer) can be preset as the basis for classifying clients. That is, each client has m valid information fields, for example the area where the client is located, the nature of the client's employer, the insurance types and liabilities the client has previously purchased, the insured amount, the premium, and claims information.
In this embodiment, the content of the m information fields can be converted into corresponding numerical values so that the distance between clients can subsequently be calculated and the similarity between clients judged. For example, if the client's area is Beijing, the corresponding information field is recorded as the value 1; if the client's area is Shanghai, it is recorded as the value 2; and so on. The value for each location can be set according to criteria such as geographical proximity or city size. As another example, if the client's insured amount is below 100,000, the corresponding information field is recorded as the value 1; if it is 100,000 to 500,000, the value 2; if it is 500,000 to 1,000,000, the value 3; and so on.
The computing module 506 is used to establish a density-based clustering algorithm model and calculate the local density corresponding to each client according to the screened information fields.
Specifically, the computing module 506 first assesses the distance between two clients according to the Euclidean distance formula. In this embodiment, the Euclidean distance formula is
$$d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$$
where $d_{ij}$ is the distance between client i (i = 1, 2, ..., n) and client j (j = 1, 2, ..., n), $x_{i1}, \dots, x_{im}$ are the numerical values of the m information fields of client i, and $x_{j1}, \dots, x_{jm}$ are the numerical values of the m information fields of client j. The distance reflects the similarity between two clients: the smaller the calculated value of $d_{ij}$, the more similar client i and client j are.
In this embodiment, the distance $d_{ij}$ must be calculated for every pair of clients among the n clients so that the similarity between every two clients can be judged.
The computing module 506 then sets the threshold for distinguishing client similarity. In this embodiment, the threshold is denoted $d_c$ and is used to distinguish whether two clients are relatively similar or relatively dissimilar. The condition it must satisfy is: among the calculated distances $d_{ij}$ between every two clients, the value of $d_c$ is greater than or equal to 80% of all the $d_{ij}$ values. For example, if 100 values of $d_{ij}$ are calculated for all clients, the threshold $d_c$ must be greater than or equal to 80 of those $d_{ij}$ values. When the distance $d_{ij}$ between two clients is less than the threshold $d_c$, the two clients are considered relatively similar; when the distance $d_{ij}$ is greater than or equal to the threshold $d_c$, the two clients are considered relatively dissimilar.
The computing module 506 calculates the local density corresponding to each client according to the threshold and the local density formula. In this embodiment, the local density formula is
$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c)$$
where
$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \geq 0 \end{cases}$$
The local density reflects the number of other clients that are relatively similar to a given client: the larger the calculated local density, the more clients are relatively similar to that client.
The classification module 508 is used to divide all clients into different categories according to the calculation result.
Specifically, the classification module 508 first sorts the calculated local densities in descending order. A corresponding local density is calculated for each client, so n clients correspond to n local densities, and these n local densities are then sorted in descending order.
The classification module 508 then divides all clients into K categories (0 < K < n), using the K clients with the highest local density as reference points. This specifically includes:
(1) Selecting, according to the sorted order, the K clients with the highest local density as reference points. For example, the 3 clients A, B and C with the highest local density are selected as reference points. A reference point is a client used as the standard for defining a category; other clients that are relatively similar to the reference-point client are placed in the same category as that client.
(2) For each of the K reference points, grouping the reference point and the similar clients whose distance to it is less than the threshold into one category. For example, for client A, all similar clients whose distance to client A is less than the threshold $d_c$ are found (that is, all clients relatively similar to client A), and client A and the clients found are placed in the first category. For client B, all similar clients whose distance to client B is less than the threshold $d_c$ are found, and client B and the clients found are placed in the second category. For client C, all similar clients whose distance to client C is less than the threshold $d_c$ are found, and client C and the clients found are placed in the third category.
(3) For the clients remaining after this grouping, calculating the distance between each remaining client and each of the K reference points, and placing the client in the category of the nearest reference point. For example, suppose client A and clients $A_1$, $A_2$, $A_3$ are placed in the first category, client B and client $B_1$ in the second category, and client C and clients $C_1$, $C_2$ in the third category, while clients D and E remain unclassified. The distances between client D and the reference points A, B and C, and between client E and the reference points A, B and C, are then calculated. If client D is closest to client B and client E is closest to client A, client D is placed in the second category and client E in the first category.
Next, the classification module 508 determines the optimal value of the category number K. Specifically, when the number K of clients selected as reference points differs, the resulting client categories differ as well. For example, when the 3 clients with the highest local density are selected as reference points, all clients are divided into 3 categories; when the 4 clients with the highest local density are selected as reference points, all clients are divided into 4 categories; and so on. The optimal value of the category number K therefore needs to be determined according to a predetermined algorithm so that the corresponding classification is the most reasonable.
In this embodiment, all clients can be regarded as one domain U, in which each client is a sample (n samples in total), each sample has m attributes (the information fields), and all samples in the domain U are divided into K categories. First, for the K client categories, the sum of the distances from the centre of each client category to the centre of the entire domain is calculated, giving the first distance sum $D_1$. Then, for each client category, the sum of the distances from each sample (client) in the category to the centre of that category is calculated, giving the second distance sum $D_2$, and the total of the second distance sums of all K client categories is calculated and denoted the third distance sum $D_3$. Finally, the ratio $D_1/D_3$ of the first distance sum to the third distance sum is calculated, and the category number K for which $D_1/D_3$ is largest is taken as the optimal value. Here the centre is obtained by averaging each attribute over the corresponding samples: the centre of a client category is the per-attribute average of all samples in that category, and the centre of the entire domain is the per-attribute average of all samples in the domain.
For example, suppose that when the category number is $K_1$ the calculated ratio is $D_1/D_3 = R_1$; when the category number is $K_2$ the calculated ratio is $D_1/D_3 = R_2$; and when the category number is $K_3$ the calculated ratio is $D_1/D_3 = R_3$, with $R_2 > R_3 > R_1$. The category number $K_2$ corresponding to $R_2$ is then taken as the optimal value. That is, in this case dividing all clients into $K_2$ categories is the most reasonable.
Finally, the classification module 508 completes the category division of all clients according to the optimal category number so determined. For example, if the optimal value of the category number K is determined to be 4, the 4 clients with the highest local density are selected as reference points in the manner described above, all clients are divided into 4 categories, and the category division of all clients is complete.
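How the four modules of this embodiment hand data to one another can be pictured with the following sketch. The class name, the field names and the callables' signatures are hypothetical, and the lambdas in the usage example are deliberately simplistic placeholders rather than the calculations described above; only the order of the four stages follows the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class ClientSegmentationSystem:
    """Wires four pluggable stages in the order 502 -> 504 -> 506 -> 508."""
    acquire: Callable[[], List[dict]]                       # acquisition module 502
    screen: Callable[[dict], List[float]]                   # screening module 504
    compute: Callable[[Sequence[List[float]]], List[int]]   # computing module 506
    classify: Callable[[Sequence[List[float]], Sequence[int]], List[int]]  # module 508

    def run(self) -> List[int]:
        records = self.acquire()
        vectors = [self.screen(r) for r in records]
        densities = self.compute(vectors)
        return self.classify(vectors, densities)

# Placeholder stages, used only to show the data flow.
system = ClientSegmentationSystem(
    acquire=lambda: [{"premium": 1.0}, {"premium": 2.0}, {"premium": 9.0}],
    screen=lambda r: [r["premium"]],
    compute=lambda vs: [sum(1 for w in vs if w is not v and abs(w[0] - v[0]) < 2.0)
                        for v in vs],
    classify=lambda vs, rho: [0 if r > 0 else 1 for r in rho],
)
labels = system.run()
```

Keeping each module behind a callable makes it possible to swap, for instance, a different density calculation into the computing stage without touching the other three.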
The above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
It should be noted that, as used herein, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, or the like) to perform the methods described in the embodiments of the present invention.
The preferred embodiments of the present invention have been described above, and the scope of rights of the present invention is not thereby limited. The serial numbers of the above embodiments are for description only and do not represent the relative merits of the embodiments. In addition, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one herein.
Those skilled in the art can implement the present invention in many variant ways without departing from its scope and essence; for example, a feature of one embodiment can be used in another embodiment to obtain yet another embodiment. Any modification, equivalent replacement, or improvement made within the technical concept of the present invention shall fall within the scope of rights of the present invention.

Claims (12)

  1. A client segmentation method, characterized in that the method comprises the steps of:
    obtaining the information of all clients;
    screening out preset information fields from the information of each client, including an information field for the client's location and an information field for the products purchased by the client;
    quantifying the content of the screened information fields according to the attribute characteristics of each information field, including: setting the numerical value of the client-location information field according to the geographical distance of the client's location or the size of the city, and setting the numerical value of the purchased-product information field according to the amount range of the products purchased by the client, so as to obtain the numerical value corresponding to each screened information field;
    establishing a density-based clustering algorithm model, and calculating the local density of each client from the numerical values corresponding to the screened information fields; and
    dividing all clients into different categories according to the calculated local densities.
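The quantification step of this claim can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the field names `city_tier` and `purchase_amount`, the lookup table `CITY_SIZE_SCORE`, the brackets in `AMOUNT_BRACKETS`, and the numerical scores are all hypothetical, since the claim does not fix specific values.

```python
# Hypothetical score per city size; the claim does not prescribe these values.
CITY_SIZE_SCORE = {"tier1": 3.0, "tier2": 2.0, "tier3": 1.0}

# Hypothetical amount ranges as (exclusive upper bound, score) pairs.
AMOUNT_BRACKETS = [(10_000, 1.0), (100_000, 2.0), (float("inf"), 3.0)]

def amount_score(amount):
    """Map a purchase amount to the score of the first range that contains it."""
    for upper_bound, score in AMOUNT_BRACKETS:
        if amount < upper_bound:
            return score
    raise ValueError("amount does not fall into any range")

def quantify(client):
    """Turn the screened information fields of one client into a numeric vector."""
    return [
        CITY_SIZE_SCORE[client["city_tier"]],
        amount_score(client["purchase_amount"]),
    ]
```

Each client then becomes a vector of m numbers (here m = 2) that the later distance and density calculations can operate on.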
  2. The client segmentation method according to claim 1, characterized in that the preset information fields further include the nature of the client's employer, and the insurance type liability, insured amount, premium, and claim settlement information of the products purchased by the client.
  3. The client segmentation method according to claim 1, characterized in that the step of establishing a density-based clustering algorithm model and calculating the local density of each client from the numerical values corresponding to the screened information fields specifically comprises:
    evaluating the distance between two clients according to the Euclidean distance formula;
    setting a threshold $d_c$ for distinguishing client similarity; and
    calculating the local density of each client according to the threshold $d_c$ and a local density formula.
  4. The client segmentation method according to claim 3, characterized in that the Euclidean distance formula is
    $d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{ik} - x_{jk})^2}$
    where $d_{ij}$ is the distance between client i and client j, $x_{i1}$ to $x_{im}$ are the numerical values corresponding to the m information fields of client i, and $x_{j1}$ to $x_{jm}$ are the numerical values corresponding to the m information fields of client j.
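The distance between two clients can be sketched as below, assuming the standard Euclidean distance over the m numerical field values of each client. The function names `euclidean_distance` and `pairwise_distances` are hypothetical.

```python
import math

def euclidean_distance(x_i, x_j):
    """Distance between two clients given their m numerical field values."""
    if len(x_i) != len(x_j):
        raise ValueError("both clients must have the same number of fields")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

def pairwise_distances(vectors):
    """Distance for every unordered pair of clients, keyed by (i, j) with i < j."""
    n = len(vectors)
    return {
        (i, j): euclidean_distance(vectors[i], vectors[j])
        for i in range(n)
        for j in range(i + 1, n)
    }
```

Computing each unordered pair once is enough because the distance is symmetric.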
  5. The client segmentation method according to claim 4, characterized in that the threshold $d_c$ satisfies the following condition: over the calculated distances $d_{ij}$ between every two clients, the value of $d_c$ is greater than or equal to 80% of all the $d_{ij}$ values.
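One way to pick a threshold that meets this condition is sketched below. It rests on an assumed reading of the claim: the threshold is chosen as the smallest observed distance that is greater than or equal to 80% of all pairwise distances. The function name `cutoff_distance` and the `fraction` parameter are hypothetical.

```python
import math

def cutoff_distance(distances, fraction=0.8):
    """Smallest observed distance that is >= the given fraction of all distances."""
    ordered = sorted(distances)
    if not ordered:
        raise ValueError("at least one distance is required")
    # Index of the value that has ceil(fraction * n) values at or below it.
    index = max(math.ceil(fraction * len(ordered)) - 1, 0)
    return ordered[index]
```

Any larger value would also satisfy the stated condition; taking the smallest one is a design choice of this sketch, not something the claim requires.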
  6. The client segmentation method according to claim 4, characterized in that the local density formula is
    [formula and symbol definitions not reproduced in this text]
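The local density formula itself is not legible in this text, so the following is only a sketch under a loudly stated assumption: the local density of a client is taken to be the number of other clients whose distance to it is below the threshold, which is the cutoff-kernel form commonly used in density-peak clustering. The function name `local_densities` is hypothetical, and the patent's actual formula may differ.

```python
import math

def local_densities(vectors, d_c):
    """ASSUMED density: for each client, count the other clients closer than d_c."""
    n = len(vectors)
    rho = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # The distance is symmetric, so one comparison updates both clients.
            if math.dist(vectors[i], vectors[j]) < d_c:
                rho[i] += 1
                rho[j] += 1
    return rho
```

Under this assumption a client surrounded by many similar clients receives a high density, while an isolated client receives a density of zero.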
  7. The client segmentation method according to claim 1, characterized in that the step of dividing all clients into different categories according to the calculated local densities specifically comprises:
    sorting the calculated local densities in descending order;
    dividing all clients into K categories using the K clients with the largest local density as reference points;
    determining the optimal value of the category number K; and
    completing the category division of all clients according to the determined optimal category number.
  8. The client segmentation method according to claim 7, characterized in that the step of dividing all clients into K categories using the K clients with the largest local density as reference points specifically comprises:
    selecting, according to the sorting, the K clients with the largest local density as reference points;
    for each of the K reference points, grouping the similar clients whose distance to that reference point is less than the threshold into one category with it; and
    for the clients remaining after this grouping, calculating the distance between each remaining client and each of the K reference points, and assigning each remaining client to the category of its nearest reference point.
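The two-stage assignment in this claim can be sketched as follows. The function name `assign_clients` is hypothetical, Euclidean distance is assumed, and one detail is an added assumption: when a client lies within the threshold of more than one reference point, it goes to the nearest of them, since the claim does not say how to resolve that case.

```python
import math

def assign_clients(vectors, rho, k, d_c):
    """Return, for each client, the index of the reference point of its category."""
    # Select the K clients with the largest local density as reference points.
    order = sorted(range(len(vectors)), key=lambda i: rho[i], reverse=True)
    refs = order[:k]
    labels = [None] * len(vectors)
    # Stage 1: clients closer than d_c to a reference point join its category
    # (ASSUMPTION: the nearest such reference point wins if there are several).
    for i, v in enumerate(vectors):
        close = [r for r in refs if math.dist(v, vectors[r]) < d_c]
        if close:
            labels[i] = min(close, key=lambda r: math.dist(v, vectors[r]))
    # Stage 2: every remaining client joins its nearest reference point.
    for i, v in enumerate(vectors):
        if labels[i] is None:
            labels[i] = min(refs, key=lambda r: math.dist(v, vectors[r]))
    return labels
```

With the tie-breaking assumption above, both stages place a client with its nearest reference point, so the stages are kept separate here only to mirror the wording of the claim.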
  9. The client segmentation method according to claim 7, characterized in that the step of determining the optimal value of the category number K specifically comprises:
    regarding all clients as a domain, in which each client is a sample;
    for the K categories, calculating a first distance sum of the distances from the center of each category to the center of the entire domain;
    for each category, calculating a second distance sum of the distances from each sample in that category to the center of that category;
    calculating the total of the second distance sums of all K categories, denoted as a third distance sum;
    calculating the ratio of the first distance sum to the third distance sum; and
    taking the category number K at which the ratio is largest as the optimal value.
  10. A client segmentation system, characterized in that the system comprises:
    an acquisition module for obtaining the information of all clients;
    a screening module for screening out preset information fields from the information of each client, including an information field for the client's location and an information field for the products purchased by the client;
    a computing module for quantifying the content of the screened information fields according to the attribute characteristics of each information field, including: setting the numerical value of the client-location information field according to the geographical distance of the client's location or the size of the city, and setting the numerical value of the purchased-product information field according to the amount range of the products purchased by the client, so as to obtain the numerical value corresponding to each screened information field; and for establishing a density-based clustering algorithm model and calculating the local density of each client from the numerical values corresponding to the screened information fields; and
    a classification module for dividing all clients into different categories according to the calculated local densities.
  11. The client segmentation system according to claim 10, characterized in that the process by which the computing module calculates the local density of each client specifically comprises:
    evaluating the distance between two clients according to the Euclidean distance formula;
    setting a threshold $d_c$ for distinguishing client similarity; and
    calculating the local density of each client according to the threshold $d_c$ and a local density formula.
  12. The client segmentation system according to claim 10, characterized in that the process by which the classification module divides all clients into different categories according to the calculation result specifically comprises:
    sorting the calculated local densities in descending order;
    dividing all clients into K categories using the K clients with the largest local density as reference points;
    determining the optimal value of the category number K; and
    completing the category division of all clients according to the determined optimal category number.

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
CN201611005111.7A, CN107194815B (en) | 2016-11-15 | 2016-11-15 | Client segmentation method and system
PCT/CN2017/091365, WO2018090643A1 (en) | 2016-11-15 | 2017-06-30 | Customer classification method, and electronic device and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201611005111.7A, CN107194815B (en) | 2016-11-15 | 2016-11-15 | Client segmentation method and system

Publications (2)

Publication Number | Publication Date
CN107194815A (en) | 2017-09-22
CN107194815B (en) | 2018-06-22

Family

ID=59871619

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201611005111.7A, CN107194815B (en), Active | Client segmentation method and system | 2016-11-15 | 2016-11-15

Country Status (2)

Country | Link
CN (1) | CN107194815B (en)
WO (1) | WO2018090643A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN108153824B (en) * | 2017-12-06 | 2020-04-24 | 阿里巴巴集团控股有限公司 | Method and device for determining target user group
CN108985950B (en) * | 2018-07-13 | 2023-04-18 | 平安科技(深圳)有限公司 | Electronic device, user fraud protection risk early warning method and storage medium
CN109670852A (en) * | 2018-09-26 | 2019-04-23 | 平安普惠企业管理有限公司 | User classification method, device, terminal and storage medium
CN113094615B (en) * | 2019-12-23 | 2024-03-01 | 中国石油天然气股份有限公司 | Message pushing method, device, equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
KR20020049164A (en) * | 2000-12-19 | 2002-06-26 | 오길록 | The System and Method for Auto - Document - classification by Learning Category using Genetic algorithm and Term cluster
CN101420313B (en) * | 2007-10-22 | 2011-01-12 | 北京搜狗科技发展有限公司 | Method and system for clustering customer terminal user group
CN102339389B (en) * | 2011-09-14 | 2013-05-29 | 清华大学 | Fault detection method for one-class support vector machine based on density parameter optimization
CN102664961B (en) * | 2012-05-04 | 2014-08-20 | 北京邮电大学 | Method for anomaly detection in MapReduce environment
US9111228B2 (en) * | 2012-10-29 | 2015-08-18 | Sas Institute Inc. | System and method for combining segmentation data
CN103559630A (en) * | 2013-10-31 | 2014-02-05 | 华南师范大学 | Customer segmentation method based on customer attribute and behavior characteristic analysis
CN104751263A (en) * | 2013-12-31 | 2015-07-01 | 南京理工大学常熟研究院有限公司 | Metrological calibration service oriented intelligent client grade classification method

Also Published As

Publication number | Publication date
WO2018090643A1 (en) | 2018-05-24
CN107194815A (en) | 2017-09-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant