WO2018090643A1 - Customer classification method, electronic device, and storage medium - Google Patents

Customer classification method, electronic device, and storage medium

Info

Publication number
WO2018090643A1
WO2018090643A1 PCT/CN2017/091365 CN2017091365W WO2018090643A1 WO 2018090643 A1 WO2018090643 A1 WO 2018090643A1 CN 2017091365 W CN2017091365 W CN 2017091365W WO 2018090643 A1 WO2018090643 A1 WO 2018090643A1
Authority
WO
WIPO (PCT)
Prior art keywords
customer
customers
distance
categories
local density
Prior art date
Application number
PCT/CN2017/091365
Other languages
English (en)
French (fr)
Inventor
马向东
吴海波
冯雨旸
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2018090643A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Definitions

  • the present invention relates to the field of data processing technologies, and in particular, to a customer classification method, an electronic device, and a storage medium.
  • an object of the present invention is to provide a customer classification method, an electronic device, and a storage medium to solve the problem of how to accurately and comprehensively classify customers.
  • the present invention provides a customer classification method, the method comprising the steps of:
  • the present invention also provides an electronic device including: a memory, a processor, and a display.
  • the memory stores a client classification program, and when the client classification program is executed by the processor, the following steps can be implemented:
  • the present invention also provides a computer readable storage medium having a client classification program stored thereon, and when the client classification program is executed by the processor, any step of the above customer classification method can be implemented.
  • the beneficial effects of the invention are that, compared with the prior art, the proposed customer classification method, electronic device, and storage medium can comprehensively and accurately divide all customers into different categories according to customer characteristics, and the number of categories is optimized so that the classification is more reasonable. This gives business personnel an effective reference for product promotion and supports precise marketing.
  • FIG. 1 is a flowchart of a customer classification method according to a first embodiment of the present invention;
  • FIG. 2 is a detailed flowchart of step S104 in FIG. 1;
  • FIG. 3 is a detailed flowchart of step S106 in FIG. 1;
  • FIG. 4 is a detailed flowchart of step S302 in FIG. 3;
  • FIG. 5 is a schematic diagram of an electronic device according to a second embodiment of the present invention.
  • FIG. 6 is a block diagram of the customer classification program of FIG. 5.
  • a first embodiment of the present invention provides a customer classification method, which includes the following steps:
  • relevant information of all customers that need to perform classification statistics is obtained, where the number of the customers is n (n is a positive integer).
  • S102 Filter a preset information field from each customer's information.
  • m information fields of reference value (m is a positive integer) may be preset as the basis for classifying customers. That is, each customer has m valid information fields, such as the region where the customer is located, the nature of the customer's employer, the insurance coverage the customer has previously purchased, the insured amount, the premium, and claims information.
  • the contents of the m information fields can each be converted into a corresponding numeric value so that the distance between customers can be calculated and their similarity judged. For example, if the customer is located in Beijing, the corresponding field is recorded as 1; if the customer is located in Shanghai, it is recorded as 2, and so on. The value for each location can be set according to criteria such as geographical distance or city size. As another example, an insured amount below 100,000 is recorded as 1, an insured amount of 100,000 to 500,000 is recorded as 2, an insured amount of 500,000 to 1,000,000 is recorded as 3, and so on.
  • referring to FIG. 2, a detailed flowchart of step S104 is shown.
  • the process includes the steps:
  • x_i1 to x_im correspond to the values of the m information fields of customer i
  • x_j1 to x_jm correspond to the values of the m information fields of customer j.
  • the distance reflects the similarity between two customers: the smaller the calculated distance d_ij, the more similar customer i and customer j are.
  • the distance d_ij is calculated for every pair of customers, so that the similarity of every pair can be judged.
  • the threshold, denoted d_c, is used to distinguish whether two customers are similar or not. The condition it must satisfy is defined over the calculated distances d_ij between every pair of customers:
  • the value of d_c is greater than or equal to 80% of all d_ij values. For example, if 100 values of d_ij are calculated over all customers, the threshold d_c must be greater than or equal to 80 of them.
  • when the distance d_ij between two customers is less than the threshold d_c, the two customers are considered similar; when d_ij is greater than or equal to d_c, the two customers are considered not very similar.
  • the local density formula is
  • the local density is used to reflect the number of other customers that are similar to the customer, and the greater the calculated local density, the greater the number of other customers that are similar to the customer.
  • referring to FIG. 3, a detailed flowchart of step S106 is shown.
  • the process includes the steps:
  • n customers correspond to n local densities, which are then sorted from largest to smallest.
  • a reference point is a customer treated as the standard for a category; that is, other customers who are similar to the reference-point customer can be placed in the same category as that customer.
  • referring to FIG. 4, a detailed flowchart of step S302 is shown. The process includes the steps:
  • three customers A, B, and C with the highest local density are selected as reference points.
  • for customer A above, find all customers whose distance from customer A is less than the threshold d_c (that is, all customers similar to customer A), and place customer A and the customers found in the first category.
  • for customer B above, find all customers whose distance from customer B is less than the threshold d_c (that is, all customers similar to customer B), and place customer B and the customers found in the second category.
  • for customer C above, find all customers whose distance from customer C is less than the threshold d_c (that is, all customers similar to customer C), and place customer C and the customers found in the third category.
  • customer A and customers A_1, A_2, and A_3 are classified into the first category
  • customer B and customer B_1 are classified into the second category
  • customer C and customers C_1 and C_2 are classified into the third category
  • the remaining customers D and E are not yet classified. Therefore, the distances between customer D and the reference-point customers A, B, and C, and between customer E and the reference-point customers A, B, and C, are calculated. If customer D is closest to customer B and customer E is closest to customer A, customer D is placed in the second category and customer E in the first category.
  • when the number K of customers selected as reference points differs, a different set of K customer categories is obtained. For example, when the 3 customers with the highest local density are selected as reference points, all customers are divided into 3 categories; when the 4 customers with the highest local density are selected, all customers are divided into 4 categories, and so on. The optimal value of the number of categories K therefore needs to be determined by a predetermined algorithm so that the resulting classification is the most reasonable.
  • all customers can be regarded as one domain U, in which each customer is one sample (n samples in total), each sample has m attributes (the information fields), and all samples in U are divided into K categories. First, for the K customer categories, calculate the first distance sum D_1 from the center of each customer category to the center of the entire domain. Then, for each customer category, calculate the second distance sum D_2 from each sample (customer) in that category to the category center, and add up the second distance sums of all K categories to obtain the third distance sum D_3.
  • finally, the ratio D_1/D_3 of the first distance sum to the third distance sum is calculated, and the number of customer categories K for which D_1/D_3 is largest is taken as the optimal value.
  • the center refers to averaging each attribute of the corresponding sample.
  • the customer category center is to average all the samples included in the customer category for each attribute.
  • the center of the entire domain is to average all the samples contained in the entire domain for each attribute.
  • suppose the number of categories is K_1, K_2, or K_3, and the corresponding ratios D_1/D_3 are R_1, R_2, and R_3, respectively
  • if R_2 > R_3 > R_1, the number of categories K_2 corresponding to R_2 is taken as the optimal value. That is, in this case it is most reasonable to divide all customers into K_2 categories.
  • if the optimal value of the number of categories K is determined to be 4, the four customers with the highest local density are selected as reference points and all customers are divided into four categories as described above, completing the classification of all customers.
  • the customer classification method described in this embodiment can comprehensively and accurately divide all customers into different categories according to customer characteristics, and optimizes the number of categories so that the classification is more reasonable.
  • this gives business personnel an effective reference for product promotion and supports precise marketing.
  • a second embodiment of the present invention provides an electronic device.
  • the electronic device includes, but is not limited to, a memory 11, a processor 12, a network interface 13, and a display 14.
  • the electronic device may be a device with data processing capability, such as a smartphone, a tablet computer, a notebook computer, or a desktop computer.
  • the memory 11 includes an internal memory and at least one type of readable storage medium.
  • the internal memory provides a cache for the operation of the electronic device.
  • the readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory.
  • in some embodiments, the readable storage medium can be an internal storage unit of the electronic device, such as a hard disk or internal memory of the electronic device.
  • in other embodiments, the readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device.
  • the readable storage medium of the memory 11 is generally used to store application software and various types of data installed on the electronic device, such as the client classification program 500.
  • the memory 11 can also be used to temporarily store data that has been output or is about to be output.
  • the processor 12 in some embodiments, may be a Central Processing Unit (CPU), microprocessor or other data processing chip for running program code or processing data stored in the memory 11.
  • the processor 12 executes the customer classification program 500 to implement any of the steps of the customer classification method described above.
  • the network interface 13 may include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the display 14, in some embodiments, may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like.
  • the display 14 is used to display information processed in the electronic device and to display a visual user interface.
  • FIG. 5 shows only the electronic device with components 11-14, but it should be understood that not all of the illustrated components are required and that more or fewer components may be implemented instead.
  • optionally, the electronic device may further include a user interface
  • the user interface may include an input unit such as a keyboard
  • optionally, the user interface may further include a standard wired interface and a wireless interface.
  • the customer classification program 500 can be divided into an acquisition module 502, a screening module 504, a calculation module 506, and a classification module 508.
  • when the processor 12 executes the computer program instruction segments of each module, any of the steps of the customer classification method described above can be implemented, based on the operations and functions those instruction segments provide. The following description details the operations and functions implemented by the acquisition module 502, the screening module 504, the calculation module 506, and the classification module 508.
  • the obtaining module 502 is configured to obtain information of all customers.
  • the obtaining module 502 acquires related information of all customers that need to perform classification statistics, where the number of the clients is n (n is a positive integer).
  • the screening module 504 is configured to filter a preset information field from information of each client.
  • m information fields of reference value (m is a positive integer) may be preset as the basis for classifying customers. That is, each customer has m valid information fields, such as the region where the customer is located, the nature of the customer's employer, the insurance coverage the customer has previously purchased, the insured amount, the premium, and claims information.
  • the contents of the m information fields can each be converted into a corresponding numeric value so that the distance between customers can be calculated and their similarity judged. For example, if the customer is located in Beijing, the corresponding field is recorded as 1; if the customer is located in Shanghai, it is recorded as 2, and so on. The value for each location can be set according to criteria such as geographical distance or city size. As another example, an insured amount below 100,000 is recorded as 1, an insured amount of 100,000 to 500,000 is recorded as 2, an insured amount of 500,000 to 1,000,000 is recorded as 3, and so on.
  • the calculation module 506 is configured to establish a density-based clustering algorithm model, and calculate a local density corresponding to each customer according to the filtered information field.
  • the calculation module 506 first evaluates the distance between the two customers based on the Euclidean distance formula.
  • the Euclidean distance formula is
  • x_i1 to x_im correspond to the values of the m information fields of customer i
  • x_j1 to x_jm correspond to the values of the m information fields of customer j.
  • the distance reflects the similarity between two customers: the smaller the calculated distance d_ij, the more similar customer i and customer j are.
  • the distance d_ij is calculated for every pair of customers, so that the similarity of every pair can be judged.
  • the calculation module 506 sets a threshold for distinguishing customer similarity.
  • the threshold, denoted d_c, is used to distinguish whether two customers are similar or not. The condition it must satisfy is defined over the calculated distances d_ij between every pair of customers:
  • the value of d_c is greater than or equal to 80% of all d_ij values. For example, if 100 values of d_ij are calculated over all customers, the threshold d_c must be greater than or equal to 80 of them.
  • when the distance d_ij between two customers is less than the threshold d_c, the two customers are considered similar; when d_ij is greater than or equal to d_c, the two customers are considered not very similar.
  • the calculation module 506 calculates a local density corresponding to each customer based on the threshold and the local density formula.
  • the local density formula is
  • the local density is used to reflect the number of other customers that are similar to the customer, and the greater the calculated local density, the greater the number of other customers that are similar to the customer.
  • the classification module 508 is configured to divide all customers into different categories according to the calculation result.
  • the classification module 508 first sorts the calculated local densities from largest to smallest. A local density is calculated for each customer, so n customers correspond to n local densities, which are then sorted from largest to smallest.
  • the classification module 508 then divides all customers into K categories (0<K<n), using the K customers with the highest local density as reference points. Specifically:
  • a reference point is a customer treated as the standard for a category; that is, other customers who are similar to the reference-point customer can be placed in the same category as that customer.
  • each of the K reference points is grouped with the similar customers whose distance from it is less than the threshold. For example, for customer A above, find all customers whose distance from customer A is less than the threshold d_c (that is, all customers similar to customer A), and place customer A and the customers found in the first category. For customer B above, find all customers whose distance from customer B is less than d_c, and place customer B and the customers found in the second category. For customer C above, find all customers whose distance from customer C is less than d_c, and place customer C and the customers found in the third category.
  • the classification module 508 determines the optimal value of the number of categories K. Specifically, when the number K of customers selected as reference points differs, a different set of K customer categories is obtained. For example, when the 3 customers with the highest local density are selected as reference points, all customers are divided into 3 categories; when the 4 customers with the highest local density are selected, all customers are divided into 4 categories, and so on. The optimal value of K therefore needs to be determined by a predetermined algorithm so that the resulting classification is the most reasonable.
  • all customers can be regarded as one domain U, in which each customer is one sample (n samples in total), each sample has m attributes (the information fields), and all samples in U are divided into K categories. First, for the K customer categories, calculate the first distance sum D_1 from the center of each customer category to the center of the entire domain. Then, for each customer category, calculate the second distance sum D_2 from each sample (customer) in that category to the category center, and add up the second distance sums of all K categories to obtain the third distance sum D_3.
  • finally, the ratio D_1/D_3 of the first distance sum to the third distance sum is calculated, and the number of customer categories K for which D_1/D_3 is largest is taken as the optimal value.
  • the center refers to averaging each attribute of the corresponding sample.
  • the customer category center is to average all the samples included in the customer category for each attribute.
  • the center of the entire domain is to average all the samples contained in the entire domain for each attribute.
  • suppose the number of categories is K_1, K_2, or K_3, and the corresponding ratios D_1/D_3 are R_1, R_2, and R_3, respectively
  • if R_2 > R_3 > R_1, the number of categories K_2 corresponding to R_2 is taken as the optimal value. That is, in this case it is most reasonable to divide all customers into K_2 categories.
  • the classification module 508 completes the classification of all customers according to the determined optimal number of categories. For example, if the optimal value of K is determined to be 4, the four customers with the highest local density are selected as reference points and all customers are divided into four categories as described above.
  • the third embodiment of the present invention further provides a computer readable storage medium having a client classification program stored thereon.
  • when the customer classification program is executed by the processor, the following steps can be implemented:
  • the preset information field includes the area where the customer is located, the nature of the unit of the customer, the customer's previous purchase insurance liability, the insurance amount, the premium and the claim information, and the content of each information field corresponds to a value.
  • the step of establishing a density-based clustering algorithm model, and calculating a local density corresponding to each client according to the filtered information field specifically includes:
  • the local density corresponding to each customer is calculated according to the threshold d c and the local density formula.
  • the Euclidean distance formula is
  • x i1 to x im corresponds to the value of the m information fields of the client i
  • x j1 to x jm correspond to the values of the m information fields of the client j.
  • the threshold d_c satisfies the following condition: over the calculated distances d_ij between every pair of customers, the value of d_c is greater than or equal to 80% of all d_ij values.
  • the local density formula is
  • the step of dividing all customers into different categories according to the calculation result specifically includes:
  • the classification of all customers is completed according to the determined number of best categories.
  • the step of dividing all customers into K categories by using K customers with the highest local density as a reference point specifically includes:
  • each of the K reference points is grouped with the similar customers whose distance from it is less than the threshold;
  • for each customer remaining after that grouping, the distance to each of the K reference points is calculated, and the customer is placed in the category of the closest reference point.
  • the step of determining an optimal value of the number K of the categories specifically includes:
  • the number of categories K corresponding to the maximum value is taken as the optimum value.
  • the method of the foregoing embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, but in many cases the former is the better implementation.
  • the part of the technical solution of the present invention that is essential or that contributes over the prior art may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc),
  • the software product including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods described in the various embodiments of the present invention.

Abstract

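A customer classification method, an electronic device, and a storage medium. The method comprises: obtaining information on all customers (S100); filtering preset information fields from each customer's information (S102); establishing a density-based clustering algorithm model and calculating the local density of each customer from the filtered information fields (S104); and dividing all customers into different categories according to the calculated local densities (S106). Customers can thereby be classified accurately and comprehensively, providing an effective reference for product promotion.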

Description

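Customer classification method, electronic device, and storage medium
Priority claim
Under the Paris Convention, this application claims priority to Chinese patent application No. CN201611005111.7, entitled "Customer classification method and system" (客户分类方法及系统), filed on November 15, 2016, the entire content of which is incorporated herein by reference.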
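Technical field
The present invention relates to the field of data processing technologies, and in particular to a customer classification method, an electronic device, and a storage medium.
Background
In the insurance industry, insured customers usually need to be classified and counted so that business personnel can adopt different marketing strategies for different customer categories. However, existing approaches still divide customers directly by data such as age, insured amount, and premium. Such approaches use few evaluation criteria and yield results of low accuracy. They cannot uncover deeper information within the data and therefore cannot give business personnel an effective reference for product promotion.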
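Summary of the invention
In view of this, an object of the present invention is to provide a customer classification method, an electronic device, and a storage medium, so as to solve the problem of how to classify customers accurately and comprehensively.
To achieve the above object, the present invention provides a customer classification method comprising the steps of:
obtaining information on all customers;
filtering preset information fields from each customer's information;
establishing a density-based clustering algorithm model and calculating the local density of each customer from the filtered information fields; and
dividing all customers into different categories according to the calculated local densities.
To achieve the above object, the present invention also provides an electronic device comprising a memory, a processor, and a display. The memory stores a customer classification program which, when executed by the processor, implements the following steps:
obtaining information on all customers;
filtering preset information fields from each customer's information;
establishing a density-based clustering algorithm model and calculating the local density of each customer from the filtered information fields; and
dividing all customers into different categories according to the calculated local densities.
In addition, the present invention provides a computer-readable storage medium storing a customer classification program which, when executed by a processor, implements any step of the above customer classification method.
The beneficial effects of the invention are that, compared with the prior art, the proposed customer classification method, electronic device, and storage medium can comprehensively and accurately divide all customers into different categories according to customer characteristics, and the number of categories is optimized so that the classification is more reasonable. This gives business personnel an effective reference for product promotion and supports precise marketing.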
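Brief description of the drawings
FIG. 1 is a flowchart of a customer classification method according to a first embodiment of the present invention;
FIG. 2 is a detailed flowchart of step S104 in FIG. 1;
FIG. 3 is a detailed flowchart of step S106 in FIG. 1;
FIG. 4 is a detailed flowchart of step S302 in FIG. 3;
FIG. 5 is a schematic diagram of an electronic device according to a second embodiment of the present invention;
FIG. 6 is a block diagram of the customer classification program of FIG. 5.
The realization of the object, the functional features, and the advantages of the present invention are further described with reference to the embodiments and the accompanying drawings.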
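Detailed description
To make the technical problem to be solved, the technical solution, and the beneficial effects of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.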
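First embodiment
As shown in FIG. 1, the first embodiment of the present invention provides a customer classification method comprising the following steps:
S100, obtaining information on all customers.
Specifically, the relevant information of all customers that need to be classified and counted is obtained, where the number of customers is n (n is a positive integer).
S102, filtering preset information fields from each customer's information.
Specifically, m information fields of reference value (m is a positive integer) may be preset as the basis for classifying customers. That is, each customer has m valid information fields, such as the region where the customer is located, the nature of the customer's employer, the insurance coverage the customer has previously purchased, the insured amount, the premium, and claims information.
In this embodiment, the contents of the m information fields can each be converted into a corresponding numeric value so that the distance between customers can be calculated later and their similarity judged. For example, if the customer is located in Beijing, the corresponding field is recorded as 1; if the customer is located in Shanghai, it is recorded as 2, and so on. The value for each location can be set according to criteria such as geographical distance or city size. As another example, an insured amount below 100,000 is recorded as 1, an insured amount of 100,000 to 500,000 is recorded as 2, an insured amount of 500,000 to 1,000,000 is recorded as 3, and so on.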
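The field-to-value conversion described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the names `REGION_CODES`, `encode_insured_amount`, and `encode_customer` are hypothetical, only two of the m fields are shown, and the handling of the band boundaries (exactly 100,000 or 500,000) is an assumption because the text does not specify it.

```python
# Hypothetical lookup table; the text only gives Beijing -> 1 and Shanghai -> 2.
REGION_CODES = {"Beijing": 1, "Shanghai": 2}


def encode_insured_amount(amount):
    """Map an insured amount to the band values used in the example.

    Boundary handling is an assumption: the text does not say which band
    an amount of exactly 100,000 or 500,000 belongs to.
    """
    if amount < 100_000:
        return 1
    if amount < 500_000:
        return 2
    if amount <= 1_000_000:
        return 3
    raise ValueError("the example defines no band above 1,000,000")


def encode_customer(record):
    """Turn one customer record (a dict) into a numeric feature vector."""
    return [
        REGION_CODES[record["region"]],
        encode_insured_amount(record["insured_amount"]),
    ]


print(encode_customer({"region": "Shanghai", "insured_amount": 300_000}))  # [2, 2]
```

In practice each of the m preset fields would need its own mapping, and the choice of numeric codes directly affects the distances computed in the next step.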
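S104, establishing a density-based clustering algorithm model and calculating the local density of each customer from the filtered information fields.
Specifically, FIG. 2 shows a detailed flowchart of step S104. The process includes the following steps:
S200, evaluating the distance between two customers according to the Euclidean distance formula.
In this embodiment, the Euclidean distance formula (formula image PCTCN2017091365-appb-000001 in the original publication) is
$$d_{ij} = \sqrt{(x_{i1} - x_{j1})^2 + (x_{i2} - x_{j2})^2 + \cdots + (x_{im} - x_{jm})^2}$$
where d_ij is the distance between customer i (i=1,2,…,n) and customer j (j=1,2,…,n), x_i1 to x_im are the values of the m information fields of customer i, and x_j1 to x_jm are the values of the m information fields of customer j. The distance reflects the similarity between two customers: the smaller the calculated distance d_ij, the more similar customer i and customer j are.
In this embodiment, the distance d_ij is calculated for every pair of the n customers, so that the similarity of every pair can be judged.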
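Step S200 can be sketched as below, assuming each customer has already been encoded as a list of m numbers. The function names `euclidean` and `distance_matrix` are illustrative, and storing all pairs in a full n-by-n matrix is a design choice made here for clarity rather than something the text prescribes.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(customers):
    """Return the symmetric matrix d[i][j] of pairwise distances."""
    n = len(customers)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each unordered pair is computed once and mirrored.
            d[i][j] = d[j][i] = euclidean(customers[i], customers[j])
    return d


print(distance_matrix([[1, 1], [4, 5]])[0][1])  # 5.0
```

The matrix needs memory proportional to n squared, which may matter for large customer sets.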
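S202, setting a threshold for distinguishing customer similarity.
In this embodiment, the threshold is denoted d_c and is used to distinguish whether two customers are similar or not. The condition it must satisfy is as follows: over the calculated distances d_ij between every pair of customers, the value of d_c is greater than or equal to 80% of all d_ij values. For example, if 100 values of d_ij are calculated over all customers, the threshold d_c must be greater than or equal to 80 of them. When the distance d_ij between two customers is less than the threshold d_c, the two customers are considered similar; when d_ij is greater than or equal to d_c, the two customers are considered not very similar.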
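One way to pick a threshold that meets the 80% condition is sketched below. The text only requires d_c to be greater than or equal to 80% of the distances; choosing the smallest observed distance that satisfies this is an assumption of the sketch, and `similarity_threshold` and `is_similar` are hypothetical names.

```python
def similarity_threshold(distances, percent=80):
    """Return the smallest observed distance that is >= percent% of all distances."""
    ordered = sorted(distances)
    if not ordered:
        raise ValueError("at least one pairwise distance is required")
    # Number of distances the threshold must be greater than or equal to,
    # rounded up with integer arithmetic to avoid floating-point surprises.
    covered = (percent * len(ordered) + 99) // 100
    return ordered[max(covered, 1) - 1]


def is_similar(d_ij, d_c):
    """Two customers count as similar only when d_ij is strictly below d_c."""
    return d_ij < d_c


d_c = similarity_threshold(range(1, 101))  # 100 distances: 1, 2, ..., 100
print(d_c)                  # 80
print(is_similar(79, d_c))  # True
print(is_similar(80, d_c))  # False
```

Any larger value would also satisfy the stated condition, so the returned value is best read as a lower bound on d_c.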
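S204, calculating the local density of each customer according to the threshold and the local density formula.
In this embodiment, the local density formula is
[formula image PCTCN2017091365-appb-000002]
where
[formula image PCTCN2017091365-appb-000003]
The local density reflects how many other customers are similar to a given customer: the greater the calculated local density, the more other customers are similar to that customer.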
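The formula images are not reproduced in this text, so the sketch below relies on the verbal description only: the local density of customer i is taken to be the number of other customers j with d_ij < d_c. This cutoff-count form is an assumption consistent with the surrounding text, and `local_densities` is a hypothetical name.

```python
def local_densities(dist, d_c):
    """Count, for each customer, the other customers closer than d_c.

    dist is a symmetric n-by-n matrix of pairwise distances.
    """
    n = len(dist)
    return [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]


# Three customers on a line at positions 0, 1, and 10.
dist = [
    [0, 1, 10],
    [1, 0, 9],
    [10, 9, 0],
]
print(local_densities(dist, d_c=2))  # [1, 1, 0]
```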
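Returning to FIG. 1: S106, dividing all customers into different categories according to the calculation results.
Specifically, FIG. 3 shows a detailed flowchart of step S106. The process includes the following steps:
S300, sorting the calculated local densities from largest to smallest.
Specifically, a local density is calculated for each customer, so n customers correspond to n local densities; these n local densities are then sorted from largest to smallest.
S302, dividing all customers into K categories (0<K<n), using the K customers with the highest local density as reference points. A reference point is a customer treated as the standard for a category; that is, other customers who are similar to the reference-point customer can be placed in the same category as that customer.
Specifically, FIG. 4 shows a detailed flowchart of step S302. The process includes the following steps:
S400, selecting the K customers with the highest local density as reference points according to the sorted order.
For example, the 3 customers A, B, and C with the highest local density are selected as reference points.
S402, grouping each of the K reference points with the similar customers whose distance from it is less than the threshold.
For example, for customer A above, find all customers whose distance from customer A is less than the threshold d_c (that is, all customers similar to customer A), and place customer A and the customers found in the first category. For customer B above, find all customers whose distance from customer B is less than d_c (that is, all customers similar to customer B), and place customer B and the customers found in the second category. For customer C above, find all customers whose distance from customer C is less than d_c (that is, all customers similar to customer C), and place customer C and the customers found in the third category.
S404, for each customer remaining after that grouping, calculating the distance between the customer and each of the K reference points, and placing the customer in the category of the closest reference point.
For example, suppose customer A and customers A1, A2, and A3 are placed in the first category, customer B and customer B1 in the second category, and customer C and customers C1 and C2 in the third category, while customers D and E remain unclassified. The distances between customer D and the reference-point customers A, B, and C, and between customer E and the reference-point customers A, B, and C, are then calculated. If customer D is closest to customer B and customer E is closest to customer A, customer D is placed in the second category and customer E in the first category.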
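Steps S400 to S404 can be sketched as follows. The text does not say what happens when a customer lies within d_c of several reference points, or when two reference points are within d_c of each other; this sketch assumes that reference points are processed in descending density order, that the first claim wins, and that a reference point is never reassigned. The function name `classify` is hypothetical.

```python
def classify(dist, density, d_c, k):
    """Assign every customer to one of k categories.

    Returns (refs, labels): refs[c] is the index of the reference customer
    of category c, and labels[i] is the category of customer i.
    """
    n = len(dist)
    # S400: the k customers with the highest local density become reference points.
    refs = sorted(range(n), key=lambda i: density[i], reverse=True)[:k]
    labels = {r: c for c, r in enumerate(refs)}
    # S402: unlabeled customers closer than d_c to a reference point join its
    # category (assumption: first claim wins when several would match).
    for c, r in enumerate(refs):
        for i in range(n):
            if i not in labels and dist[r][i] < d_c:
                labels[i] = c
    # S404: each remaining customer joins the category of the nearest reference point.
    for i in range(n):
        if i not in labels:
            labels[i] = min(range(len(refs)), key=lambda c: dist[refs[c]][i])
    return refs, [labels[i] for i in range(n)]


# Six customers on a line; the densities are supplied by hand for illustration.
points = [0, 1, 2, 10, 11, 30]
dist = [[abs(a - b) for b in points] for a in points]
refs, labels = classify(dist, density=[1, 3, 1, 1, 2, 0], d_c=1.5, k=2)
print(refs)    # [1, 4]
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Here the customer at position 30 has no reference point within d_c and is attached to the nearer one (position 11) by step S404.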
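Returning to FIG. 3: S304, determining the optimal value of the number of categories K.
Specifically, when the number K of customers selected as reference points differs, a different set of K customer categories is obtained. For example, when the 3 customers with the highest local density are selected as reference points, all customers are divided into 3 categories; when the 4 customers with the highest local density are selected, all customers are divided into 4 categories, and so on. The optimal value of K therefore needs to be determined by a predetermined algorithm so that the resulting classification is the most reasonable.
In this embodiment, all customers can be regarded as one domain U, in which each customer is one sample (n samples in total), each sample has m attributes (the information fields), and all samples in U are divided into K categories. First, for the K customer categories, the first distance sum D1 from the center of each customer category to the center of the entire domain is calculated. Then, for each customer category, the second distance sum D2 from each sample (customer) in that category to the category center is calculated, and the second distance sums of all K categories are added up to give the third distance sum D3. Finally, the ratio D1/D3 of the first distance sum to the third distance sum is calculated, and the number of customer categories K for which D1/D3 is largest is taken as the optimal value. Here a center is obtained by averaging each attribute over the corresponding samples: the center of a customer category is the per-attribute average of all samples in that category, and the center of the entire domain is the per-attribute average of all samples in the domain.
For example, suppose that when the number of categories is K1 the corresponding ratio is D1/D3=R1, when the number of categories is K2 the ratio is D1/D3=R2, and when the number of categories is K3 the ratio is D1/D3=R3, with R2>R3>R1. The number of categories K2 corresponding to R2 is then taken as the optimal value. In other words, in this case it is most reasonable to divide all customers into K2 categories.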
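The D1/D3 criterion can be sketched as below, assuming Euclidean distance between samples and centers (the text does not name the metric for this step). The names `separation_ratio` and `best_k` are hypothetical, and returning infinity when D3 is zero is an assumption for a case the text does not address.

```python
import math


def _distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _center(samples):
    """Per-attribute average of a non-empty list of samples."""
    return [sum(s[k] for s in samples) / len(samples) for k in range(len(samples[0]))]


def separation_ratio(samples, labels):
    """Compute D1/D3 for one classification of the samples."""
    domain_center = _center(samples)
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    d1 = d3 = 0.0
    for members in groups.values():
        center = _center(members)
        d1 += _distance(center, domain_center)             # contributes to D1
        d3 += sum(_distance(s, center) for s in members)   # this category's D2
    # Assumption: every sample sits exactly on its category center.
    return d1 / d3 if d3 > 0 else math.inf


def best_k(samples, labelings):
    """labelings maps each candidate K to the labels obtained for that K."""
    return max(labelings, key=lambda k: separation_ratio(samples, labelings[k]))


samples = [[0.0], [1.0], [10.0], [11.0]]
labelings = {1: [0, 0, 0, 0], 2: [0, 0, 1, 1]}
print(separation_ratio(samples, labelings[2]))  # 5.0
print(best_k(samples, labelings))               # 2
```

Because D3 tends to shrink as K grows, the ratio may favor larger K, so the set of candidate values of K offered to `best_k` is worth choosing deliberately.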
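S306, completing the classification of all customers according to the determined optimal number of categories.
For example, if the optimal value of the number of categories K is determined to be 4, the 4 customers with the highest local density are selected as reference points and all customers are divided into 4 categories as described above, completing the classification of all customers.
The customer classification method described in this embodiment can comprehensively and accurately divide all customers into different categories according to customer characteristics, and optimizes the number of categories so that the classification is more reasonable. This gives business personnel an effective reference for product promotion and supports precise marketing.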
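Second embodiment
As shown in FIG. 5, the second embodiment of the present invention provides an electronic device. The electronic device includes, but is not limited to, a memory 11, a processor 12, a network interface 13, and a display 14.
The electronic device may be a device with data processing capability, such as a smartphone, a tablet computer, a notebook computer, or a desktop computer.
The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device; the readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device, such as a hard disk or internal memory of the electronic device. In other embodiments, the readable storage medium may be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device.
In this embodiment, the readable storage medium of the memory 11 is generally used to store application software installed on the electronic device and various types of data, such as the customer classification program 500. The memory 11 can also be used to temporarily store data that has been output or is about to be output.
The processor 12, in some embodiments, may be a Central Processing Unit (CPU), a microprocessor, or another data processing chip, used to run program code stored in the memory 11 or to process data. The processor 12 executes the customer classification program 500 to implement any step of the customer classification method described above.
The network interface 13 may include a standard wired interface and a wireless interface (such as a WI-FI interface).
The display 14, in some embodiments, may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display 14 is used to display information processed in the electronic device and to display a visual user interface.
FIG. 5 shows only the electronic device with components 11-14, but it should be understood that not all of the illustrated components are required and that more or fewer components may be implemented instead.
Optionally, the electronic device may further include a user interface. The user interface may include an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface.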
在本实施例中,如图6所示,所述的客户分类程序500可以被分割成获取模块502、筛选模块504、计算模块506及分类模块508。当处理器12执行各模块的计算机程序指令段时,基于各个计算机程序指令段所能实现的操作和功能,可实现上述客户分类方法的任一步骤。以下描述将具体介绍所述获取模块502、筛选模块504、计算模块506及分类模块508所实现的操作和功能。
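The acquisition module 502 is configured to acquire the information of all customers.
Specifically, the acquisition module 502 acquires the relevant information of all customers who need to be classified, where the number of customers is n (n is a positive integer).
The screening module 504 is configured to screen preset information fields from the information of each customer.
Specifically, m information fields of reference value (m is a positive integer) may be preset as the basis for classifying customers. That is, each customer has m valid information fields, such as the region where the customer is located, the nature of the customer's employer, the liability coverage of insurance products previously purchased by the customer, the sum insured, the premium and claim information.
In this embodiment, the content of each of the m information fields can be converted into a corresponding numerical value, so that the distance between customers can subsequently be calculated and the similarity between customers judged. For example, if the customer's region is Beijing, the corresponding field is recorded as the value 1; if it is Shanghai, the field is recorded as the value 2, and so on. The value for each location may be set according to conditions such as geographical distance or city size. As another example, a sum insured below 100,000 is recorded as the value 1, a sum insured of 100,000 to 500,000 as the value 2, a sum insured of 500,000 to 1,000,000 as the value 3, and so on.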
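As a minimal sketch of this conversion, the following Python maps two of the fields to numbers. Only the pairs Beijing to 1, Shanghai to 2 and the three sum-insured bands come from the text; the names `REGION_CODES`, `encode_sum_insured` and `encode_customer`, the record layout, the handling of band boundaries and the error raised above 1,000,000 are all assumptions made for illustration.

```python
# Hypothetical lookup table; only these two entries appear in the text.
REGION_CODES = {"Beijing": 1, "Shanghai": 2}

def encode_sum_insured(amount):
    """Map a sum insured to the band value 1, 2 or 3 from the example."""
    if amount < 100_000:
        return 1
    if amount < 500_000:      # which band owns a boundary is an assumption
        return 2
    if amount <= 1_000_000:
        return 3
    # The text says "and so on" without defining further bands.
    raise ValueError("band above 1,000,000 is not specified in the text")

def encode_customer(record):
    """Turn a customer record (a dict, assumed layout) into a numeric vector."""
    return [
        REGION_CODES[record["region"]],
        encode_sum_insured(record["sum_insured"]),
    ]

vector = encode_customer({"region": "Shanghai", "sum_insured": 300_000})
```

A real deployment would need one such mapping per information field, so that every customer ends up as a vector of m numbers.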
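The calculation module 506 is configured to establish a density-based clustering algorithm model and calculate the local density of each customer from the screened information fields.
Specifically, the calculation module 506 first evaluates the distance between two customers according to the Euclidean distance formula. In this embodiment, the Euclidean distance formula is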
Figure PCTCN2017091365-appb-000004
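where dij is the distance between customer i (i = 1, 2, …, n) and customer j (j = 1, 2, …, n), xi1 to xim are the values of the m information fields of customer i, and xj1 to xjm are the values of the m information fields of customer j. The distance reflects the similarity between two customers: the smaller the calculated distance dij, the more similar customer i and customer j are.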
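The formula itself appears only as a figure reference in this text. Assuming it is the standard Euclidean distance, which matches the symbol definitions given here, it can be written as follows; the summation index k is a name introduced for this sketch.

```latex
\[
d_{ij} = \sqrt{\sum_{k=1}^{m} \left( x_{ik} - x_{jk} \right)^{2}}
\]
```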
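In this embodiment, the distance dij is calculated for every pair of the n customers, so that the similarity between every two customers can be judged.
The calculation module 506 sets a threshold for distinguishing customer similarity. In this embodiment, the threshold is denoted dc and is used to distinguish whether two customers are relatively similar or not very similar. The condition to be satisfied is that, over the calculated distances dij between every two customers, dc is greater than or equal to 80% of all the dij values. For example, if 100 values of dij are calculated for all customers, the threshold dc must be greater than or equal to 80 of them. When the distance dij between two customers is less than the threshold dc, the two customers are considered relatively similar; when dij is greater than or equal to dc, they are considered not very similar.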
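One way to pick a threshold that meets this condition is sketched below. The text only gives a lower bound for dc, so choosing the smallest observed distance that satisfies it is an assumption, as are the function name `choose_threshold` and the `percent` parameter.

```python
def choose_threshold(distances, percent=80):
    """Return the smallest observed distance that is >= `percent`% of all values.

    `distances` holds one d_ij per customer pair. Integer arithmetic is used
    for the ceiling so that no floating-point rounding affects the index.
    """
    ordered = sorted(distances)
    if not ordered:
        raise ValueError("at least one pairwise distance is required")
    needed = (percent * len(ordered) + 99) // 100   # ceil(percent/100 * count)
    return ordered[max(needed, 1) - 1]

# With 100 distances 1..100, the threshold must cover 80 of them.
dc = choose_threshold(range(1, 101))
```

Because similarity is tested with a strict inequality (dij < dc), a pair whose distance equals the returned threshold counts as not very similar.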
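The calculation module 506 calculates the local density of each customer according to the threshold and a local density formula. In this embodiment, the local density formula is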
Figure PCTCN2017091365-appb-000005
where
Figure PCTCN2017091365-appb-000006
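The local density reflects how many other customers are relatively similar to a given customer: the larger the calculated local density, the more other customers are relatively similar to that customer.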
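The local density formula is shown only as a figure reference in this text. A reading consistent with the description is that the local density of customer i counts the other customers j whose distance dij is below dc; the sketch below implements that reading, which is an assumption, with the hypothetical function name `local_densities`.

```python
import math

def local_densities(vectors, dc):
    """Count, for each customer, the other customers closer than `dc`.

    Assumed reading of the formula: rho_i = number of j != i with d_ij < dc,
    where d_ij is the Euclidean distance between the customers' field vectors.
    """
    n = len(vectors)
    rho = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(vectors[i], vectors[j]) < dc:
                rho[i] += 1   # j is similar to i
                rho[j] += 1   # and i is similar to j
    return rho

rho = local_densities([[0, 0], [1, 0], [0, 1], [10, 10]], dc=1.5)
```

In the sample call, the first three customers are mutually within the threshold, while the fourth has no similar customers.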
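The classification module 508 is configured to divide all customers into different categories according to the calculation results.
Specifically, the classification module 508 first sorts the calculated local densities in descending order. A local density is calculated for each customer, so n customers correspond to n local densities, which are then sorted in descending order.
Then, the classification module 508 divides all customers into K categories (0 < K < n), using the K customers with the highest local density as reference points. This specifically includes:
(1) Select the K customers with the highest local density as reference points according to the sorting. For example, select the 3 customers A, B and C with the highest local density as reference points. A reference point means that the customer is treated as the standard for defining a category; that is, other customers who are relatively similar to the reference customer can be placed in the same category as that customer.
(2) Group each of the K reference points with the similar customers whose distance from it is less than the threshold. For example, for customer A, find all similar customers whose distance from customer A is less than the threshold dc (i.e., all customers relatively similar to customer A), and place customer A and the customers found in the first category. For customer B, find all similar customers whose distance from customer B is less than the threshold dc, and place customer B and the customers found in the second category. For customer C, find all similar customers whose distance from customer C is less than the threshold dc, and place customer C and the customers found in the third category.
(3) For the customers remaining after this grouping, calculate the distance between each customer and the K reference points, and place the customer in the same category as the nearest reference point. For example, suppose customer A is grouped with customers A1, A2 and A3 in the first category, customer B with customer B1 in the second category, and customer C with customers C1 and C2 in the third category, while customers D and E remain unclassified. The distances between customer D and the reference customers A, B and C, and between customer E and the reference customers A, B and C, are then calculated. If customer D is closest to customer B and customer E is closest to customer A, customer D is placed in the second category and customer E in the first category.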
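Steps (1) to (3) can be sketched as follows. The function name `assign_categories` is hypothetical, and two details are assumptions because the text does not settle them: ties in local density are broken by input order, and a customer within dc of several reference points goes to the one with the higher density rank.

```python
import math

def assign_categories(vectors, rho, dc, k):
    """Partition customers into k categories around the k densest customers."""
    n = len(vectors)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    # Step (1): the k customers with the highest local density are references.
    order = sorted(range(n), key=lambda i: rho[i], reverse=True)
    refs = order[:k]
    labels = [None] * n
    for category, ref in enumerate(refs):
        labels[ref] = category
    # Step (2): customers closer than dc to a reference join its category.
    for category, ref in enumerate(refs):
        for i in range(n):
            if labels[i] is None and math.dist(vectors[i], vectors[ref]) < dc:
                labels[i] = category
    # Step (3): each remaining customer joins the nearest reference point.
    for i in range(n):
        if labels[i] is None:
            labels[i] = min(
                range(k), key=lambda c: math.dist(vectors[i], vectors[refs[c]])
            )
    return refs, labels

vectors = [[0, 0], [1, 0], [-1, 0], [0, 1], [10, 10], [11, 10], [9, 10], [6, 6]]
rho = [3, 1, 1, 1, 2, 1, 1, 0]   # local densities for dc = 1.2
refs, labels = assign_categories(vectors, rho, dc=1.2, k=2)
```

In this sample the last customer, at (6, 6), is not within dc of either reference point and is assigned in step (3) to the nearer one.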
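Next, the classification module 508 determines the optimal value of the number of categories K. Specifically, different numbers K of customers selected as reference points yield different sets of K customer categories. For example, when the 3 customers with the highest local density are selected as reference points, all customers are divided into 3 categories; when the 4 customers with the highest local density are selected as reference points, all customers are divided into 4 categories, and so on. A predetermined algorithm is therefore needed to determine the optimal value of the number of categories K, so that the corresponding classification is the most reasonable.
In this embodiment, all customers can be regarded as a domain U, in which each customer is a sample (n samples in total), each sample has m attributes (i.e., the information fields), and all samples in the domain U are divided into K categories. First, for the K customer categories, the distance from the center of each customer category to the center of the whole domain is calculated, and these distances are summed to give a first distance sum D1. Then, for each customer category, a second distance sum D2 is calculated as the sum of the distances from each sample (customer) in that category to the center of that category, and the second distance sums of all K customer categories are added together to give a third distance sum D3. Finally, the ratio D1/D3 of the first distance sum to the third distance sum is calculated, and the number of customer categories K for which D1/D3 is largest is taken as the optimal value. Here, a center is obtained by averaging each attribute over the corresponding samples. For example, the center of a customer category is the per-attribute average of all samples contained in that category, and the center of the whole domain is the per-attribute average of all samples contained in the whole domain.
For example, suppose that D1/D3 = R1 when the number of categories is K1, D1/D3 = R2 when the number of categories is K2, and D1/D3 = R3 when the number of categories is K3, with R2 > R3 > R1. Then K2, the number of categories corresponding to R2, is taken as the optimal value. In other words, in this case it is most reasonable to divide all customers into K2 categories.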
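Written out symbolically, the criterion might look like the following. The symbols are introduced here for illustration and do not appear in the text: C_c is the set of samples in category c, \mu_c its center, \mu the center of the domain U, and the norm is assumed to be Euclidean.

```latex
\[
\begin{aligned}
\mu &= \frac{1}{n} \sum_{x \in U} x, &
\mu_c &= \frac{1}{|C_c|} \sum_{x \in C_c} x, \\
D_1 &= \sum_{c=1}^{K} \lVert \mu_c - \mu \rVert, &
D_3 &= \sum_{c=1}^{K} \sum_{x \in C_c} \lVert x - \mu_c \rVert, \\
K^{*} &= \arg\max_{K} \frac{D_1(K)}{D_3(K)}
\end{aligned}
\]
```

The inner sum in D_3 is the second distance sum D2 of category c.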
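Finally, the classification module 508 completes the classification of all customers according to the determined optimal number of categories. For example, if the optimal value of the number of categories K is determined to be 4, the classification of all customers is completed in the manner described above, by selecting the 4 customers with the highest local density as reference points and dividing all customers into 4 categories.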
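Third Embodiment
A third embodiment of the present invention further provides a computer-readable storage medium storing a customer classification program which, when executed by a processor, implements the following steps:
acquiring the information of all customers;
screening preset information fields from the information of each customer;
establishing a density-based clustering algorithm model and calculating the local density of each customer from the screened information fields; and
dividing all customers into different categories according to the calculated local densities.
Preferably, the preset information fields include the region where the customer is located, the nature of the customer's employer, the liability coverage of insurance products previously purchased by the customer, the sum insured, the premium and claim information, and the content of each information field corresponds to a numerical value.
Preferably, the step of establishing a density-based clustering algorithm model and calculating the local density of each customer from the screened information fields specifically comprises:
evaluating the distance between two customers according to the Euclidean distance formula;
setting a threshold dc for distinguishing customer similarity;
calculating the local density of each customer according to the threshold dc and a local density formula.
Preferably, the Euclidean distance formula is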
Figure PCTCN2017091365-appb-000007
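where dij is the distance between customer i and customer j, xi1 to xim are the values of the m information fields of customer i, and xj1 to xjm are the values of the m information fields of customer j.
Preferably, the condition satisfied by the threshold dc is that, over the calculated distances dij between every two customers, the value of dc is greater than or equal to 80% of all the dij values.
Preferably, the local density formula is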
Figure PCTCN2017091365-appb-000008
where
Figure PCTCN2017091365-appb-000009
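Preferably, the step of dividing all customers into different categories according to the calculation results specifically comprises:
sorting the calculated local densities in descending order;
dividing all customers into K categories using the K customers with the highest local density as reference points;
determining the optimal value of the number of categories K;
completing the classification of all customers according to the determined optimal number of categories.
Preferably, the step of dividing all customers into K categories using the K customers with the highest local density as reference points specifically comprises:
selecting the K customers with the highest local density as reference points according to the sorting;
grouping each of the K reference points with the similar customers whose distance from it is less than the threshold;
for the customers remaining after the grouping, calculating the distance between each remaining customer and the K reference points, and placing the remaining customer in the same category as the nearest reference point.
Preferably, the step of determining the optimal value of the number of categories K specifically comprises:
regarding all customers as a domain, in which each customer is a sample;
for the K categories, calculating a first distance sum of the distances from the center of each category to the center of the whole domain;
for each category, calculating a second distance sum of the distances from each sample in the category to the center of the category;
calculating the total of the second distance sums of all K categories, denoted as a third distance sum;
calculating the ratio of the first distance sum to the third distance sum;
taking the number of categories K for which the ratio is largest as the optimal value.
The specific implementation of the computer-readable storage medium of the present invention is substantially the same as that of the embodiments of the customer classification method described above, and is not repeated here.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article or device that includes the element.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods described in the embodiments of the present invention.
The preferred embodiments of the present invention have been described above with reference to the accompanying drawings, which does not thereby limit the scope of rights of the present invention. The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments. In addition, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
Those skilled in the art can implement the present invention in many variations without departing from its scope and essence; for example, a feature of one embodiment can be used in another embodiment to obtain yet another embodiment. Any modification, equivalent replacement or improvement made within the technical concept of the present invention shall fall within the scope of rights of the present invention.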

Claims (20)

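  1. A customer classification method, characterized in that the method comprises the steps of:
    acquiring the information of all customers;
    screening preset information fields from the information of each customer;
    establishing a density-based clustering algorithm model and calculating the local density of each customer from the screened information fields; and
    dividing all customers into different categories according to the calculated local densities.
  2. The customer classification method according to claim 1, characterized in that the preset information fields include the region where the customer is located, the nature of the customer's employer, the liability coverage of insurance products previously purchased by the customer, the sum insured, the premium and claim information, and the content of each information field corresponds to a numerical value.
  3. The customer classification method according to claim 1, characterized in that the step of establishing a density-based clustering algorithm model and calculating the local density of each customer from the screened information fields specifically comprises:
    evaluating the distance between two customers according to the Euclidean distance formula;
    setting a threshold dc for distinguishing customer similarity;
    calculating the local density of each customer according to the threshold dc and a local density formula.
  4. The customer classification method according to claim 3, characterized in that the Euclidean distance formula is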
    Figure PCTCN2017091365-appb-100001
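    where dij is the distance between customer i and customer j, xi1 to xim are the values of the m information fields of customer i, and xj1 to xjm are the values of the m information fields of customer j.
  5. The customer classification method according to claim 4, characterized in that the condition satisfied by the threshold dc is that, over the calculated distances dij between every two customers, the value of dc is greater than or equal to 80% of all the dij values.
  6. The customer classification method according to claim 4, characterized in that the local density formula is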
    Figure PCTCN2017091365-appb-100002
    where
    Figure PCTCN2017091365-appb-100003
  7. The customer classification method according to claim 5, characterized in that the local density formula is
    Figure PCTCN2017091365-appb-100004
    where
    Figure PCTCN2017091365-appb-100005
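  8. The customer classification method according to claim 1, characterized in that the step of dividing all customers into different categories according to the calculation results specifically comprises:
    sorting the calculated local densities in descending order;
    dividing all customers into K categories using the K customers with the highest local density as reference points;
    determining the optimal value of the number of categories K;
    completing the classification of all customers according to the determined optimal number of categories.
  9. The customer classification method according to claim 8, characterized in that the step of dividing all customers into K categories using the K customers with the highest local density as reference points specifically comprises:
    selecting the K customers with the highest local density as reference points according to the sorting;
    grouping each of the K reference points with the similar customers whose distance from it is less than the threshold;
    for the customers remaining after the grouping, calculating the distance between each remaining customer and the K reference points, and placing the remaining customer in the same category as the nearest reference point.
  10. The customer classification method according to claim 8, characterized in that the step of determining the optimal value of the number of categories K specifically comprises:
    regarding all customers as a domain, in which each customer is a sample;
    for the K categories, calculating a first distance sum of the distances from the center of each category to the center of the whole domain;
    for each category, calculating a second distance sum of the distances from each sample in the category to the center of the category;
    calculating the total of the second distance sums of all K categories, denoted as a third distance sum;
    calculating the ratio of the first distance sum to the third distance sum;
    taking the number of categories K for which the ratio is largest as the optimal value.
  11. An electronic device, characterized in that the electronic device comprises a memory, a processor and a display, the memory storing a customer classification program which, when executed by the processor, implements the following steps:
    acquiring the information of all customers;
    screening preset information fields from the information of each customer;
    establishing a density-based clustering algorithm model and calculating the local density of each customer from the screened information fields; and
    dividing all customers into different categories according to the calculated local densities.
  12. The electronic device according to claim 11, characterized in that the preset information fields include the region where the customer is located, the nature of the customer's employer, the liability coverage of insurance products previously purchased by the customer, the sum insured, the premium and claim information, and the content of each information field corresponds to a numerical value.
  13. The electronic device according to claim 11, characterized in that the step of establishing a density-based clustering algorithm model and calculating the local density of each customer from the screened information fields specifically comprises:
    evaluating the distance between two customers according to the Euclidean distance formula;
    setting a threshold dc for distinguishing customer similarity;
    calculating the local density of each customer according to the threshold dc and a local density formula.
  14. The electronic device according to claim 13, characterized in that the Euclidean distance formula is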
    Figure PCTCN2017091365-appb-100006
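    where dij is the distance between customer i and customer j, xi1 to xim are the values of the m information fields of customer i, and xj1 to xjm are the values of the m information fields of customer j.
  15. The electronic device according to claim 14, characterized in that the condition satisfied by the threshold dc is that, over the calculated distances dij between every two customers, the value of dc is greater than or equal to 80% of all the dij values.
  16. The electronic device according to claim 14, characterized in that the local density formula is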
    Figure PCTCN2017091365-appb-100007
    where
    Figure PCTCN2017091365-appb-100008
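  17. The electronic device according to claim 11, characterized in that the step of dividing all customers into different categories according to the calculation results specifically comprises:
    sorting the calculated local densities in descending order;
    dividing all customers into K categories using the K customers with the highest local density as reference points;
    determining the optimal value of the number of categories K;
    completing the classification of all customers according to the determined optimal number of categories.
  18. The electronic device according to claim 17, characterized in that the step of dividing all customers into K categories using the K customers with the highest local density as reference points specifically comprises:
    selecting the K customers with the highest local density as reference points according to the sorting;
    grouping each of the K reference points with the similar customers whose distance from it is less than the threshold;
    for the customers remaining after the grouping, calculating the distance between each remaining customer and the K reference points, and placing the remaining customer in the same category as the nearest reference point.
  19. The electronic device according to claim 17, characterized in that the step of determining the optimal value of the number of categories K specifically comprises:
    regarding all customers as a domain, in which each customer is a sample;
    for the K categories, calculating a first distance sum of the distances from the center of each category to the center of the whole domain;
    for each category, calculating a second distance sum of the distances from each sample in the category to the center of the category;
    calculating the total of the second distance sums of all K categories, denoted as a third distance sum;
    calculating the ratio of the first distance sum to the third distance sum;
    taking the number of categories K for which the ratio is largest as the optimal value.
  20. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a customer classification program which, when executed by the processor, implements any step of the customer classification method according to claims 1-10.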
PCT/CN2017/091365 2016-11-15 2017-06-30 Customer classification method, electronic device and storage medium WO2018090643A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611005111.7A CN107194815B (zh) 2016-11-15 2016-11-15 Customer classification method and system
CN201611005111.7 2016-11-15

Publications (1)

Publication Number Publication Date
WO2018090643A1 true WO2018090643A1 (zh) 2018-05-24

Family

ID=59871619

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/091365 WO2018090643A1 (zh) 2016-11-15 2017-06-30 Customer classification method, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN107194815B (zh)
WO (1) WO2018090643A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
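CN108153824B (zh) * 2017-12-06 2020-04-24 阿里巴巴集团控股有限公司 Method and device for determining a target user group
CN108985950B (zh) * 2018-07-13 2023-04-18 平安科技(深圳)有限公司 Electronic device, early-warning method for user insurance fraud risk, and storage medium
CN109670852A (zh) * 2018-09-26 2019-04-23 平安普惠企业管理有限公司 User classification method, device, terminal and storage medium
CN113094615B (zh) * 2019-12-23 2024-03-01 中国石油天然气股份有限公司 Message pushing method, device, equipment and storage medium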

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020078044A1 (en) * 2000-12-19 2002-06-20 Jong-Cheol Song System for automatically classifying documents by category learning using a genetic algorithm and a term cluster and method thereof
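CN102664961A (zh) * 2012-05-04 2012-09-12 北京邮电大学 Anomaly detection method in a MapReduce environment
CN103559630A (zh) * 2013-10-31 2014-02-05 华南师范大学 Customer segmentation method based on analysis of customer attributes and behavioral characteristics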
US20140122401A1 (en) * 2012-10-29 2014-05-01 Sas Institute Inc. System and Method for Combining Segmentation Data

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
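CN101420313B (zh) * 2007-10-22 2011-01-12 北京搜狗科技发展有限公司 Method and system for clustering client user groups
CN102339389B (zh) * 2011-09-14 2013-05-29 清华大学 Density-based parameter-optimized one-class support vector machine fault detection method
CN104751263A (zh) * 2013-12-31 2015-07-01 南京理工大学常熟研究院有限公司 Intelligent customer-grade classification method for metrological verification services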

Also Published As

Publication number Publication date
CN107194815B (zh) 2018-06-22
CN107194815A (zh) 2017-09-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17870912

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM XXXX DATED 11.09.2019)

122 Ep: pct application non-entry in european phase

Ref document number: 17870912

Country of ref document: EP

Kind code of ref document: A1