CN108664653A - A kind of Medical Consumption client's automatic classification method based on K-means - Google Patents

Info

Publication number
CN108664653A
Authority
CN
China
Prior art keywords
client
cluster
customer
year
method based
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810477477.7A
Other languages
Chinese (zh)
Inventor
古万荣
施玉健
毛宜军
李海良
朱韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Top & Amp Guangzhou (gene) Precision Medical Technology Co Ltd
Original Assignee
Top & Amp Guangzhou (gene) Precision Medical Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-05-18
Filing date
2018-05-18
Publication date
2018-10-16
Application filed by Top & Amp Guangzhou (gene) Precision Medical Technology Co Ltd
Priority to CN201810477477.7A
Publication of CN108664653A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Abstract

The invention discloses an automatic classification method for medical consumption customers based on K-means, comprising the steps of: S1, extracting statistical data from a customer database; S2, following the CRM (Customer Relationship Management) approach, building a customer classification model from each customer's annual consumption frequency and annual total consumption amount; S3, using the customer classification model of step S2 to obtain the feature vector of each customer; S4, clustering the feature vectors obtained in step S3 with the K-means machine learning algorithm; S5, taking the clustering result of step S4 as the customer segmentation result, thereby classifying the customers. The invention applies machine learning algorithms to classify customers automatically according to their consumption frequency and total consumption amount, so that the classification of customers, including the classification criteria themselves, is carried out entirely by machine, saving manpower and material resources.

Description

An automatic classification method for medical consumption customers based on K-means
Technical field
The present invention relates to the technical fields of machine learning and data mining, and in particular to an automatic classification method for medical consumption customers based on K-means.
Background
With the rapid development and maturing of computer storage and computing capacity, storing and processing massive amounts of data has become feasible. At the same time, processing data with machine learning algorithms is increasingly common. By analyzing existing large data sets with machine learning algorithms, hidden relationships in the data can be uncovered and used to design concrete schemes that improve the efficiency and revenue of related industries in many fields, including the combination of the medical market with data mining.
Medical stores hold large amounts of customer data. Using these data to classify customers and to offer better-targeted services to each category can bring a medical store more revenue. Classifying a large number of customers by hand, however, is impractical and typically suffers from the following problems: 1) different people are influenced by subjective factors when classifying customers; 2) the classification criteria are likewise influenced by the subjective judgment of whoever formulates them; 3) manually classifying customer information consumes a great deal of time and effort, wasting the medical store's resources.
Summary of the invention
The object of the present invention is to overcome the shortcomings of manual classification in the prior art by providing an automatic classification method for medical consumption customers based on K-means. The method uses data mining to classify offline medical consumption customers automatically; specifically, it applies machine learning algorithms to classify customers according to their consumption frequency and total consumption amount. As a result, the classification of customers, including the classification criteria, is carried out entirely by machine, saving manpower and material resources.
To achieve the above object, the technical solution provided by the present invention is an automatic classification method for medical consumption customers based on K-means, comprising the following steps:
S1, extract statistical data from the customer database;
S2, following the CRM (Customer Relationship Management) approach, build a customer classification model from each customer's annual consumption frequency and annual total consumption amount;
S3, using the customer classification model of step S2, obtain the feature vector of each customer;
S4, cluster the feature vectors obtained in step S3 with the K-means machine learning algorithm;
S5, take the clustering result of step S4 as the customer segmentation result, thereby classifying the customers.
In step S1, each customer's basic information is retrieved from the relevant customer database by query, and aggregate functions are used to compute each customer's annual consumption frequency and annual consumption amount; these serve as the input data of the automatic classification method.
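The aggregation in step S1 can be sketched with Python's built-in `sqlite3` module and an in-memory database. This is only an illustration under stated assumptions: the `orders` table, its columns (`customer_id`, `order_date`, `amount`), the helper name `yearly_stats` and the sample rows are all hypothetical, since the patent does not specify a schema or a database product.

```python
import sqlite3

def yearly_stats(conn, year):
    """Return {customer_id: (annual_frequency, annual_total)} for one calendar year."""
    rows = conn.execute(
        """
        SELECT customer_id,
               COUNT(*)    AS annual_frequency,  -- number of purchases in the year
               SUM(amount) AS annual_total       -- total amount spent in the year
        FROM orders
        WHERE strftime('%Y', order_date) = ?
        GROUP BY customer_id
        ORDER BY customer_id
        """,
        (str(year),),
    ).fetchall()
    return {cid: (freq, total) for cid, freq, total in rows}

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2017-01-15", 120.0),
        (1, "2017-06-02", 80.0),
        (2, "2017-03-20", 1500.0),
        (2, "2016-12-30", 999.0),  # falls outside 2017, so the query ignores it
    ],
)
stats = yearly_stats(conn, 2017)
```

`COUNT` and `SUM` with `GROUP BY` play the role of the aggregate functions mentioned in the text; any relational database offering them could be used in the same way.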
In step S2, the customer classification model is constructed with the customer's annual consumption frequency as the abscissa and the annual total consumption amount as the ordinate.
In step S2, CRM is a modern management approach for improving the relationship between an enterprise and its customers. An important part of customer relationship management is customer segmentation, which is essentially a method of dividing a large group of customers or consumers into several segments such that the customers or consumers within one segment have similar characteristics.
In step S3, the feature vector of each customer is obtained using the customer classification model established in step S2. The input data here contain exactly the two dimensions of the customer classification model, so the data of each customer from step S1 are used directly as that customer's feature vector.
In step S4, the customer feature vectors of step S3 are used as the input of the K-means algorithm, which is executed to classify the customers. The clustering process of the K-means algorithm consists of the following three steps:
S41, initialize the cluster centers
k samples are selected at random from the data set, and these k samples serve as the initial cluster centers of this clustering process (three in this method);
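A minimal sketch of the random initialization in S41, using only the Python standard library. The function name `init_centers`, the `seed` parameter and the sample points (annual frequency, annual total) are assumptions made for illustration; the patent only states that k samples are picked at random.

```python
import random

def init_centers(points, k, seed=None):
    """Draw k samples from the data set, without replacement, as initial centers."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(points, k)

# Hypothetical feature vectors: (annual consumption frequency, annual total amount).
points = [(2, 150.0), (3, 220.0), (12, 2400.0), (15, 3100.0), (30, 9800.0)]
centers = init_centers(points, k=3, seed=0)
```

Because the starting centers are random, different runs can end in different clusterings; fixing the seed is one simple way to make a run repeatable.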
S42, assignment step
Let $t$ denote the iteration number, $S_i^{(t)}$ the $i$-th cluster at iteration $t$, $m_i^{(t)}$ the $i$-th cluster center at iteration $t$, $m_j^{(t)}$ the $j$-th cluster center at iteration $t$, $x_p$ the data point being assigned, and $k$ the required number of clusters. The assignment rule is:
$$S_i^{(t)} = \left\{ x_p : \left\| x_p - m_i^{(t)} \right\|^2 \le \left\| x_p - m_j^{(t)} \right\|^2 \ \ \forall j,\ 1 \le j \le k \right\}$$
This formula expresses the criterion K-means uses when clustering: each data point $x_p$ is assigned so that its distance to the center of the cluster it belongs to is minimal;
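The assignment step of S42 can be illustrated as follows. This is a sketch, not the patent's implementation: the helper names and sample values are hypothetical, squared Euclidean distance is assumed as the distance measure, and ties are broken in favor of the lower cluster index.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def assign(points, centers):
    """Return clusters, where clusters[i] holds the points nearest to centers[i]."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)),
                      key=lambda i: squared_distance(p, centers[i]))
        clusters[nearest].append(p)
    return clusters

# Hypothetical feature vectors: (annual consumption frequency, annual total amount).
points = [(2, 150.0), (3, 220.0), (12, 2400.0), (15, 3100.0), (30, 9800.0)]
centers = [(2, 150.0), (12, 2400.0), (30, 9800.0)]
clusters = assign(points, centers)
```

With these values the first two points join the first center, the next two the second, and the last point stays with the third.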
S43, update step
Let $m_i^{(t+1)}$ be the $i$-th cluster center at iteration $t+1$, $S_i^{(t)}$ the $i$-th cluster at iteration $t$, and $x_j$ the $j$-th data point in $S_i^{(t)}$. For the clusters obtained in the assignment step, the cluster centers are recalculated by:
$$m_i^{(t+1)} = \frac{1}{\left| S_i^{(t)} \right|} \sum_{x_j \in S_i^{(t)}} x_j$$
This formula expresses how K-means updates a cluster center: the mean of all data points in the cluster is computed and taken as the new center of that cluster;
Here, a cluster is one of the groups obtained after the clustering algorithm has run, and a cluster center is the central point of such a group.
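The update step of S43 amounts to a per-dimension mean over each cluster. The sketch below uses hypothetical names and sample values; keeping the previous center when a cluster is empty is an assumption of this example, as the patent does not discuss empty clusters.

```python
def update_centers(clusters, old_centers):
    """Return the mean of each cluster as its new center."""
    new_centers = []
    for cluster, old in zip(clusters, old_centers):
        if not cluster:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(old)
            continue
        dims = len(cluster[0])
        new_centers.append(
            tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dims))
        )
    return new_centers

# Hypothetical clusters of (annual consumption frequency, annual total amount).
clusters = [[(2, 150.0), (3, 220.0)], [(12, 2400.0), (15, 3100.0)], [(30, 9800.0)]]
old_centers = [(2, 150.0), (12, 2400.0), (30, 9800.0)]
new_centers = update_centers(clusters, old_centers)
```

For example, the first cluster's new center is ((2 + 3) / 2, (150 + 220) / 2) = (2.5, 185.0).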
In step S5, the clustering result obtained in step S4 is taken as the classification of the customers. There are three specific categories: thrifty-type customers, economical-type customers and petty-bourgeois-type customers.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1, the invention classifies customers according to the CRM approach, using each customer's annual purchase frequency and annual total purchase amount as the customer classification model, so the classification results are more scientific and have reference value.
2, the invention uses a machine learning algorithm to classify customers automatically, making customer segmentation fully automatic. No manual participation is needed, which saves substantial financial and material resources as well as time.
Description of the drawings
Fig. 1 is a flow chart of the processing steps of the method of the present invention.
Detailed description of the embodiments
The present invention is further described below with reference to a specific embodiment.
As shown in Fig. 1, the automatic classification method for medical consumption customers based on K-means provided by this embodiment uses data mining to classify offline medical consumption customers automatically; specifically, it applies machine learning algorithms to classify customers according to their consumption frequency and total consumption amount. It comprises the following steps:
Step S1, extract customer data from the database
This step extracts each customer's basic information, together with statistics including the annual total consumption amount and the annual consumption frequency, from the database, mainly by executing SQL commands directly or by operating the database through a database API. The extracted data serve as the input of this classification method.
Step S2, following the CRM (Customer Relationship Management) approach, build the customer classification model from two dimensions: the customer's annual consumption frequency and annual total consumption amount;
This step establishes the customer classification model from the customer's annual consumption frequency and annual total consumption amount. Specifically, the annual consumption frequency is the abscissa and the annual total consumption amount is the ordinate. These two dimensions describe a customer and serve as the basis for customer segmentation. CRM is a modern management approach for improving the relationship between an enterprise and its customers. An important part of customer relationship management is customer segmentation, which is essentially a method of dividing a large group of customers or consumers into several segments such that the customers or consumers within one segment have similar characteristics.
Step S3, obtain customer features
Following the classification model of step S2, this step extracts a feature vector for each customer from the customer data obtained in step S1. The feature vector has length 2; its two dimensions are the customer's annual consumption frequency and the customer's annual total consumption amount.
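As a small illustration of step S3, the sketch below turns per-customer statistics into length-2 feature vectors. The dictionary shape `{customer_id: (annual_frequency, annual_total)}`, the function name and the sample values are assumptions of this example, not part of the patent.

```python
def feature_vectors(stats):
    """Build one (annual_frequency, annual_total) vector per customer.

    Returns the customer ids and the vectors in the same, sorted-by-id order,
    so that a cluster label can later be matched back to its customer.
    """
    ids = sorted(stats)
    vectors = [(float(stats[cid][0]), float(stats[cid][1])) for cid in ids]
    return ids, vectors

# Hypothetical statistics as they might come out of step S1.
stats = {7: (2, 200.0), 3: (12, 2400.0)}
ids, X = feature_vectors(stats)
```

Keeping the id list alongside the vectors is a design choice of this sketch; the clustering itself only needs the vectors.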
Step S4, cluster the extracted features
This step clusters the customer features obtained in step S3 with the K-means clustering method and takes the clustering result as the classification of the customers.
First, the number of groups to be obtained, i.e. the number of clusters k, must be specified.
Given the number of groups k, k sample points are selected at random from the set of customer feature vectors as the centers of the k clusters.
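The remaining sample points in the data set are then assigned according to the formula
$$S_i^{(t)} = \left\{ x_p : \left\| x_p - m_i^{(t)} \right\|^2 \le \left\| x_p - m_j^{(t)} \right\|^2 \ \ \forall j,\ 1 \le j \le k \right\}$$
where $t$ denotes the iteration number, $S_i^{(t)}$ is the $i$-th cluster at iteration $t$, $m_i^{(t)}$ is the $i$-th cluster center at iteration $t$, $m_j^{(t)}$ is the $j$-th cluster center at iteration $t$, $x_p$ is the data point being assigned, and $k$ is the required number of clusters.
Each sample point is thus assigned to its target cluster, and each cluster center is then updated according to the formula
$$m_i^{(t+1)} = \frac{1}{\left| S_i^{(t)} \right|} \sum_{x_j \in S_i^{(t)}} x_j$$
where $m_i^{(t+1)}$ is the $i$-th cluster center at iteration $t+1$, $S_i^{(t)}$ is the $i$-th cluster at iteration $t$, and $x_j$ is the $j$-th data point in $S_i^{(t)}$. Assignment and update are repeated until the cluster centers are stable and no longer change.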
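Putting the pieces of step S4 together, the loop below is a minimal pure-Python sketch of K-means that stops once the centers no longer change. The function signature, the optional `init` argument (added here only so that a run can be made deterministic), the `max_iter` safety cap and the sample data are assumptions of this example. Unlike the description, the sketch assigns every point, including the initial centers, which leaves the result unchanged because each initial center lies at distance zero from itself.

```python
import random

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(points, k, init=None, seed=None, max_iter=100):
    """Return (centers, labels); labels[n] is the cluster index of points[n]."""
    rng = random.Random(seed)
    # Initialization: k random samples, unless explicit centers are supplied.
    centers = list(init) if init is not None else rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        # Update: each center becomes the mean of its assigned points.
        new_centers = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                new_centers.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centers.append(centers[i])  # assumption: keep an empty cluster's center
        if new_centers == centers:  # centers are stable, so stop
            break
        centers = new_centers
    return centers, labels

# Hypothetical customers: (annual consumption frequency, annual total amount).
points = [(1, 100.0), (2, 120.0), (3, 90.0),
          (10, 2000.0), (12, 2200.0), (11, 2100.0),
          (30, 9000.0), (28, 9500.0), (32, 9200.0)]
centers, labels = kmeans(points, k=3,
                         init=[(1, 100.0), (10, 2000.0), (30, 9000.0)])
```

Note that the two dimensions are on very different scales in this sample, so the annual total dominates the distance; whether to rescale the features is not addressed in the patent and is left out of the sketch.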
Step S5, obtain the customer classification results
Using the clustering result of step S4, each cluster is treated as one customer segment, which yields the classification of the customers. There are three specific categories: thrifty-type customers, economical-type customers and petty-bourgeois-type customers. This automates customer classification for medical stores.
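The patent does not say how a numbered cluster is matched to one of the three category names. The sketch below shows one possible rule, which is purely an assumption of this example: clusters are ranked by the annual total amount of their center, lowest first. The names `CATEGORIES` and `name_clusters` and the sample values are hypothetical, and the function expects exactly three clusters.

```python
# Category names from the text, ordered from lowest to highest spending (assumed order).
CATEGORIES = ("thrifty", "economical", "petty-bourgeois")

def name_clusters(centers):
    """Map cluster index -> category name by ranking centers on annual total."""
    if len(centers) != len(CATEGORIES):
        raise ValueError("this sketch expects exactly three clusters")
    order = sorted(range(len(centers)), key=lambda i: centers[i][1])
    return {cluster: CATEGORIES[rank] for rank, cluster in enumerate(order)}

# Hypothetical K-means output: centers as (frequency, total) and one label per customer.
centers = [(30.0, 9233.3), (2.0, 103.3), (11.0, 2100.0)]
labels = [1, 1, 2, 0]
names = name_clusters(centers)
customer_categories = [names[label] for label in labels]
```

Other rules, such as ranking on frequency or on both dimensions, would be equally consistent with the text.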
The embodiment described above is only a preferred embodiment of the present invention and is not intended to limit its scope of implementation; any change made according to the shape and principle of the present invention shall fall within its scope of protection.

Claims (7)

1. An automatic classification method for medical consumption customers based on K-means, characterized by comprising the following steps:
S1, extract statistical data from the customer database;
S2, following the CRM approach, build a customer classification model from each customer's annual consumption frequency and annual total consumption amount;
S3, using the customer classification model of step S2, obtain the feature vector of each customer;
S4, cluster the feature vectors obtained in step S3 with the K-means machine learning algorithm;
S5, take the clustering result of step S4 as the customer segmentation result, thereby classifying the customers.
2. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterized in that: in step S1, each customer's basic information is retrieved from the relevant customer database by query, and aggregate functions are used to compute each customer's annual consumption frequency and annual consumption amount, which serve as the input data of the automatic classification method.
3. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterized in that: in step S2, the customer classification model is constructed with the customer's annual consumption frequency as the abscissa and the annual total consumption amount as the ordinate.
4. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterized in that: in step S2, the CRM approach is a modern management approach for improving the relationship between an enterprise and its customers, wherein an important part of customer relationship management is customer segmentation, which is essentially a method of dividing a large group of customers or consumers into several segments such that the customers or consumers within one segment have similar characteristics.
5. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterized in that: in step S3, the feature vector of each customer is obtained using the customer classification model established in step S2; the input data here contain exactly the two dimensions of the customer classification model, so the data of each customer from step S1 are used directly as that customer's feature vector.
6. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterized in that: in step S4, the customer feature vectors of step S3 are used as the input of the K-means algorithm, which is executed to classify the customers, the clustering process of the K-means algorithm consisting of the following three steps:
S41, initialize the cluster centers
k samples are selected at random from the data set, and these k samples serve as the initial cluster centers of this clustering process (three in this method);
S42, assignment step
Let $t$ denote the iteration number, $S_i^{(t)}$ the $i$-th cluster at iteration $t$, $m_i^{(t)}$ the $i$-th cluster center at iteration $t$, $m_j^{(t)}$ the $j$-th cluster center at iteration $t$, $x_p$ the data point being assigned, and $k$ the required number of clusters. The assignment rule is:
$$S_i^{(t)} = \left\{ x_p : \left\| x_p - m_i^{(t)} \right\|^2 \le \left\| x_p - m_j^{(t)} \right\|^2 \ \ \forall j,\ 1 \le j \le k \right\}$$
This formula expresses the criterion K-means uses when clustering: each data point $x_p$ is assigned so that its distance to the center of the cluster it belongs to is minimal;
S43, update step
Let $m_i^{(t+1)}$ be the $i$-th cluster center at iteration $t+1$, $S_i^{(t)}$ the $i$-th cluster at iteration $t$, and $x_j$ the $j$-th data point in $S_i^{(t)}$. For the clusters obtained in the assignment step, the cluster centers are recalculated by:
$$m_i^{(t+1)} = \frac{1}{\left| S_i^{(t)} \right|} \sum_{x_j \in S_i^{(t)}} x_j$$
This formula expresses how K-means updates a cluster center: the mean of all data points in the cluster is computed and taken as the new center of that cluster;
wherein a cluster is one of the groups obtained after the clustering algorithm has run, and a cluster center is the central point of such a group.
7. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterized in that: in step S5, the clustering result obtained in step S4 is taken as the classification of the customers, the specific categories being three: thrifty-type customers, economical-type customers and petty-bourgeois-type customers.
CN201810477477.7A 2018-05-18 2018-05-18 A kind of Medical Consumption client's automatic classification method based on K-means Pending CN108664653A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810477477.7A CN108664653A (en) 2018-05-18 2018-05-18 A kind of Medical Consumption client's automatic classification method based on K-means

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810477477.7A CN108664653A (en) 2018-05-18 2018-05-18 A kind of Medical Consumption client's automatic classification method based on K-means

Publications (1)

Publication Number Publication Date
CN108664653A true CN108664653A (en) 2018-10-16

Family

ID=63776610

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810477477.7A Pending CN108664653A (en) 2018-05-18 2018-05-18 A kind of Medical Consumption client's automatic classification method based on K-means

Country Status (1)

Country Link
CN (1) CN108664653A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111222686A (en) * 2019-11-21 2020-06-02 施甸县保施高速公路投资开发有限公司 Method for optimizing state of service area of highway
CN111400375A (en) * 2020-03-19 2020-07-10 畅捷通信息技术股份有限公司 Business opportunity mining method and device based on financial service data
CN112307111A (en) * 2020-11-02 2021-02-02 北京深演智能科技股份有限公司 Data display method and device
CN113256348A (en) * 2021-06-23 2021-08-13 北京新赛点体育投资股份有限公司 User account data processing method based on big data statistics

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016187437A1 (en) * 2015-05-19 2016-11-24 24/7 Customer, Inc. Method and system for effecting customer value based customer interaction management
CN106529968A (en) * 2016-09-29 2017-03-22 深圳大学 Customer classification method and system thereof based on transaction data
CN106503438A (en) * 2016-10-20 2017-03-15 上海科瓴医疗科技有限公司 A kind of H RFM user modeling method and system for pharmacy member analysis
CN107480187A (en) * 2017-07-10 2017-12-15 北京京东尚科信息技术有限公司 User's value category method and apparatus based on cluster analysis
CN107992883A (en) * 2017-11-22 2018-05-04 福建省计量科学研究院 A kind of metering industry customer's divided method based on CRFM models

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
谭军: "基于CRM数据挖掘的电信客户细分模型分析与设计", 《万方》 *

Similar Documents

Publication Publication Date Title
Syakur et al. Integration k-means clustering method and elbow method for identification of the best customer profile cluster
CN108664653A (en) A kind of Medical Consumption client's automatic classification method based on K-means
CN102663100B (en) Two-stage hybrid particle swarm optimization clustering method
US9536201B2 (en) Identifying associations in data and performing data analysis using a normalized highest mutual information score
CN105095522B (en) Relation table set external key recognition methods based on nearest neighbor search
CN103136355B (en) A kind of Text Clustering Method based on automatic threshold fish-swarm algorithm
CN107832456B (en) Parallel KNN text classification method based on critical value data division
CN108734216A (en) Classification of power customers method, apparatus and storage medium based on load curve form
CN111062425B (en) Unbalanced data set processing method based on C-K-SMOTE algorithm
CN109117872A (en) A kind of user power utilization behavior analysis method based on automatic Optimal Clustering
CN110502691A (en) Product method for pushing, device and readable storage medium storing program for executing based on client segmentation
Yu et al. Quantization-based clustering algorithm
Peng et al. The health care fraud detection using the pharmacopoeia spectrum tree and neural network analytic contribution hierarchy process
CN106204267A (en) A kind of based on improving k means and the customer segmentation system of neural network clustering
Aggelis et al. Customer clustering using rfm analysis
CN107729377A (en) Customer classification method and system based on data mining
CN115641177A (en) Prevent second and kill prejudgement system based on machine learning
CN107093005A (en) The method that tax handling service hall's automatic classification is realized based on big data mining algorithm
Baswade et al. A comparative study of k-means and weighted k-means for clustering
CN109583712B (en) Data index analysis method and device and storage medium
Chen et al. Efficient clustering method based on rough set and genetic algorithm
CN108388911A (en) A kind of mobile subscriber's Dynamic Fuzzy Clustering Algorithm method towards mixed attributes
Zheng Application of silence customer segmentation in securities industry based on fuzzy cluster algorithm
Li et al. K-LRFMD: method of customer value segmentation in shared transportation filed based on improved K-means algorithm
Féraud et al. The orange customer analysis platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20181016