CN108664653A - A kind of Medical Consumption client's automatic classification method based on K-means - Google Patents
- Publication number
- CN108664653A (application number CN201810477477.7A)
- Authority
- CN
- China
- Prior art keywords
- client
- cluster
- customer
- year
- method based
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Abstract
The invention discloses an automatic classification method for medical consumption customers based on K-means, comprising the steps of: S1, extracting statistical data from a customer database; S2, building a customer classification model, following the CRM (Customer Relationship Management) approach, from each customer's annual purchase frequency and annual total spending; S3, using the customer classification model of step S2 to obtain a feature vector for each customer; S4, clustering the feature vectors obtained in step S3 with the K-means machine learning algorithm; S5, taking the clustering result of step S4 as the customer segmentation result, thereby classifying the customers. The invention uses machine learning to classify customers automatically according to their purchase frequency and total spending, so that the classification of customers, including the classification criteria themselves, is carried out entirely by machine, saving manpower and material resources.
Description
Technical field
The present invention relates to the technical fields of machine learning and data mining, and in particular to an automatic classification method for medical consumption customers based on K-means.
Background art
With the rapid development and maturing of computer storage capacity and computing power, storing and processing massive amounts of data has become possible. At the same time, processing data with machine learning algorithms has become increasingly common. By analysing existing large data sets with machine learning algorithms, hidden relationships in the data can be uncovered, and concrete schemes designed around these relationships can improve the efficiency and revenue of the industries concerned. The combination of the medical market with data mining is one such case.
A medical store holds a large amount of customer data. Using these data to classify customers, and then offering better and more targeted services to each category of customer, can bring the store more revenue. Classifying a large number of customers manually, however, is impractical, and typically suffers from the following problems: 1) when different people classify customers, the process is influenced by subjective factors; 2) the classification criteria are likewise influenced by the subjective views of whoever formulates them; 3) classifying customer information manually consumes a great deal of time and effort, wasting the store's resources.
Summary of the invention
The object of the invention is to overcome the shortcomings of manual classification in the prior art by proposing an automatic classification method for medical consumption customers based on K-means. The method uses data mining technology to classify offline medical consumption customers automatically; specifically, it uses machine learning algorithms to classify customers automatically according to their purchase frequency and total spending. With the invention, the classification of customers, including the classification criteria themselves, is carried out entirely by machine, saving manpower and material resources.
To achieve the above object, the technical solution provided by the present invention is an automatic classification method for medical consumption customers based on K-means, comprising the following steps:
S1, extracting statistical data from a customer database;
S2, building a customer classification model, following the CRM (Customer Relationship Management) approach, from each customer's annual purchase frequency and annual total spending;
S3, using the customer classification model of step S2 to obtain a feature vector for each customer;
S4, clustering the feature vectors obtained in step S3 with the K-means machine learning algorithm;
S5, taking the clustering result of step S4 as the customer segmentation result, thereby classifying the customers.
In step S1, each customer's basic information is obtained by querying the relational customer database, and each customer's annual purchase frequency and annual spending amount are computed with aggregate functions; these serve as the input data of the automatic classification method.
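The aggregation described in step S1 can be sketched as follows. This is only an illustration under stated assumptions: the patent does not give a schema, so the `orders` table, its columns (`customer_id`, `order_date`, `amount`) and the helper `yearly_stats` are hypothetical, and an in-memory SQLite database stands in for the customer database.

```python
import sqlite3

# Hypothetical schema (not from the patent): one row per purchase.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2017-01-05", 20.0),
        (1, "2017-03-12", 35.5),
        (2, "2017-02-01", 400.0),
        (1, "2016-12-30", 99.0),  # falls outside the chosen year
    ],
)


def yearly_stats(conn, year):
    """Return {customer_id: (annual_frequency, annual_total)} for one year.

    COUNT(*) and SUM(amount) are the aggregate functions; GROUP BY yields
    one result row per customer.
    """
    rows = conn.execute(
        """
        SELECT customer_id, COUNT(*) AS freq, SUM(amount) AS total
        FROM orders
        WHERE strftime('%Y', order_date) = ?
        GROUP BY customer_id
        ORDER BY customer_id
        """,
        (str(year),),
    ).fetchall()
    return {cid: (freq, total) for cid, freq, total in rows}


stats = yearly_stats(conn, 2017)
```

Each value in `stats` is the (frequency, total) pair that later steps use as a customer's two-dimensional description.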
In step S2, the customer classification model is constructed with the customer's annual purchase frequency as the abscissa and the annual total spending as the ordinate.
In step S2, the CRM approach is a new type of management method for improving the relationship between an enterprise and its customers. An important part of customer relationship management is customer segmentation, which is essentially a method of dividing a large group of customers or consumers into several classes such that the customers or consumers within one class have similar characteristics.
In step S3, the customer classification model established in step S2 is used to obtain the feature vector of each customer. The input data here contain exactly the two dimensions of the customer classification model, so each customer's input data from step S1 are used directly as that customer's feature vector.
In step S4, the customer feature vectors of step S3 are used as the input to the K-means algorithm, which is executed to classify the customers. The clustering process of the K-means algorithm is divided into the following three steps:
S41, initialising the cluster centres
k samples are chosen at random from the data set and used as the initial cluster centres of the clustering process (k = 3 here, one per customer category);
S42, assignment
Let $t$ denote the iteration number, $S_i^{(t)}$ the i-th cluster at iteration $t$, $m_i^{(t)}$ the centre of the i-th cluster at iteration $t$, $m_j^{(t)}$ the centre of the j-th cluster at iteration $t$, $x_p$ the data point being assigned, and $k$ the required number of clusters. The assignment rule is:
$$S_i^{(t)} = \left\{ x_p : \lVert x_p - m_i^{(t)} \rVert^2 \le \lVert x_p - m_j^{(t)} \rVert^2 \ \ \forall j,\ 1 \le j \le k \right\}$$
This formula expresses the criterion K-means uses during clustering: each data point $x_p$ is assigned so that its distance to the centre of the cluster it belongs to is minimal;
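The assignment rule (each point goes to the cluster whose centre is nearest) can be sketched in a few lines. This is a minimal illustration, assuming squared Euclidean distance; the function names `squared_distance` and `assign_points` and the sample data are made up for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_points(points, centres):
    """Return clusters, where clusters[i] holds the points nearest centres[i].

    Ties go to the lowest cluster index, which is an arbitrary choice made
    for this sketch.
    """
    clusters = [[] for _ in centres]
    for p in points:
        nearest = min(
            range(len(centres)),
            key=lambda i: squared_distance(p, centres[i]),
        )
        clusters[nearest].append(p)
    return clusters


# (annual purchase frequency, annual total spending) pairs, invented here.
points = [(2, 50.0), (3, 80.0), (20, 900.0), (22, 1100.0)]
centres = [(2, 60.0), (21, 1000.0)]
clusters = assign_points(points, centres)
```

Comparing squared distances gives the same nearest centre as comparing distances, so no square root is needed.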
S43, update
Let $m_i^{(t+1)}$ be the centre of the i-th cluster at iteration $t+1$, $S_i^{(t)}$ the i-th cluster at iteration $t$, and $x_j$ the j-th data point in $S_i^{(t)}$. For each cluster obtained in the assignment step, the cluster centre is recalculated as:
$$m_i^{(t+1)} = \frac{1}{\lvert S_i^{(t)} \rvert} \sum_{x_j \in S_i^{(t)}} x_j$$
This formula expresses how K-means updates a cluster centre: the mean of all data points in the cluster is computed and taken as the cluster's new centre;
Here, a cluster is one of the groups obtained after the clustering algorithm has run, and a cluster centre is the central point of such a group.
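The update step (replace each centre by the mean of the points in its cluster) can be sketched as below. The name `update_centres` is hypothetical, and keeping the old centre for an empty cluster is an assumption of this sketch; the patent does not say what to do in that case.

```python
def update_centres(clusters, old_centres):
    """Return new centres: the coordinate-wise mean of each cluster's points."""
    new_centres = []
    for cluster, old in zip(clusters, old_centres):
        if not cluster:
            # Assumption: an empty cluster keeps its previous centre.
            new_centres.append(old)
            continue
        n = len(cluster)
        dims = len(cluster[0])
        new_centres.append(
            tuple(sum(p[d] for p in cluster) / n for d in range(dims))
        )
    return new_centres


clusters = [[(2, 50.0), (4, 70.0)], [(20, 900.0)], []]
old_centres = [(0, 0.0), (1, 1.0), (9, 9.0)]
new_centres = update_centres(clusters, old_centres)
```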
In step S5, the clustering result obtained in step S4 is taken as the classification result for the customers. The specific categories are three: thrifty customers, economical customers and petit-bourgeois customers.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. The invention classifies customers according to the CRM approach, using the customer's annual purchase frequency and annual total spending as the customer classification model, so the classification results are more scientific and have greater reference value.
2. The invention uses a machine learning algorithm to classify customers automatically, achieving full automation of customer classification with no manual involvement in the process, which saves considerable financial and material resources as well as time.
Description of the drawings
Fig. 1 is a flow chart of the processing steps of the method of the present invention.
Detailed description
The present invention is further described below with reference to a specific embodiment.
As shown in Fig. 1, the automatic classification method for medical consumption customers based on K-means provided by this embodiment mainly uses data mining technology to classify offline medical consumption customers automatically; specifically, it uses machine learning algorithms to classify customers automatically according to their purchase frequency and total spending. It comprises the following steps:
Step S1, extracting customer data from the database
This step extracts the customers' basic information, together with statistics including the annual total spending and annual purchase frequency, from the database, either by executing SQL commands directly or by operating the database through a database API. The extracted data are the input data of this classification method.
Step S2, building the customer classification model, following the CRM (Customer Relationship Management) approach, from two dimensions: the customer's annual purchase frequency and annual total spending
This step establishes the customer classification model from the customer's annual purchase frequency and annual total spending. Specifically, the annual purchase frequency is the abscissa and the annual total spending is the ordinate. A customer is described by these two dimensions, which serve as the criterion for customer classification. The CRM approach is a new type of management method for improving the relationship between an enterprise and its customers. An important part of customer relationship management is customer segmentation, which is essentially a method of dividing a large group of customers or consumers into several classes such that the customers or consumers within one class have similar characteristics.
Step S3, obtaining customer features
In this step, according to the classification model of step S2, a feature vector is extracted for each customer from the customer data obtained in step S1. The feature vector has length 2; its two dimensions are the customer's annual purchase frequency and the customer's annual total spending.
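As an illustration of step S3, the sketch below turns customer records into length-2 feature vectors. The record layout (`CustomerRecord` and its field names) is an assumption made for the example; the patent only states that the two dimensions are annual purchase frequency and annual total spending.

```python
from typing import NamedTuple


class CustomerRecord(NamedTuple):
    """Hypothetical shape of one row produced by step S1."""
    customer_id: int
    name: str
    annual_frequency: int
    annual_total: float


def to_feature_vector(record):
    """Map a record to (annual purchase frequency, annual total spending)."""
    return (float(record.annual_frequency), float(record.annual_total))


records = [
    CustomerRecord(1, "A", 2, 55.5),
    CustomerRecord(2, "B", 1, 400.0),
]
# Keep the ids in a parallel list so cluster labels can be mapped back later.
customer_ids = [r.customer_id for r in records]
features = [to_feature_vector(r) for r in records]
```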
Step S4, clustering the extracted features
This step clusters the customer features obtained in step S3 using the K-means clustering method, and takes the clustering result as the classification result for the customers.
First, the number of groups to be obtained, i.e. the number of clusters k, is specified.
Given k, k sample points are selected at random from the set of customer feature vectors as the initial centres of the k clusters.
Each remaining sample point in the data set is assigned to its target cluster according to the formula
$$S_i^{(t)} = \left\{ x_p : \lVert x_p - m_i^{(t)} \rVert^2 \le \lVert x_p - m_j^{(t)} \rVert^2 \ \ \forall j,\ 1 \le j \le k \right\}$$
where $t$ denotes the iteration number, $S_i^{(t)}$ the i-th cluster at iteration $t$, $m_i^{(t)}$ the centre of the i-th cluster at iteration $t$, $m_j^{(t)}$ the centre of the j-th cluster at iteration $t$, $x_p$ the data point being assigned, and $k$ the required number of clusters.
Each cluster centre is then updated according to the formula
$$m_i^{(t+1)} = \frac{1}{\lvert S_i^{(t)} \rvert} \sum_{x_j \in S_i^{(t)}} x_j$$
where $m_i^{(t+1)}$ is the centre of the i-th cluster at iteration $t+1$, $S_i^{(t)}$ is the i-th cluster at iteration $t$, and $x_j$ is the j-th data point in $S_i^{(t)}$. Assignment and update are repeated until the cluster centres are stable and no longer change.
Step S5, obtaining the customer classification results
Using the clustering result obtained in step S4, each cluster is treated as one customer class, which gives the classification result for the customers. The specific categories are three: thrifty customers, economical customers and petit-bourgeois customers. This achieves automatic classification of the medical store's customers.
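The patent names three categories but does not state how a given cluster is matched to a name. One plausible convention, shown purely as an assumption, is to rank the cluster centres by annual total spending and name them in ascending order. The English labels and the function `name_clusters` are this sketch's own.

```python
# English renderings of the three categories named in step S5.
CATEGORY_NAMES = ["thrifty", "economical", "petit-bourgeois"]


def name_clusters(centres):
    """Map cluster index -> category name.

    Assumption (not stated in the patent): clusters are ranked by the
    spending coordinate of their centre, lowest first.
    """
    order = sorted(range(len(centres)), key=lambda i: centres[i][1])
    return {cluster: CATEGORY_NAMES[rank] for rank, cluster in enumerate(order)}


# Example centres as (annual frequency, annual total spending), invented here.
centres = [(32.0, 2100.0), (2.5, 60.0), (11.0, 420.0)]
labels = [1, 1, 2, 2, 0, 0]  # cluster index per customer
names = name_clusters(centres)
customer_categories = [names[lab] for lab in labels]
```

Ranking by purchase frequency, or by a combination of both dimensions, would be an equally defensible choice; which one suits a store depends on how it intends to use the categories.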
The embodiment described above is only a preferred embodiment of the invention and is not intended to limit its scope of implementation; any change made according to the shape and principle of the present invention shall therefore fall within its scope of protection.
Claims (7)
1. An automatic classification method for medical consumption customers based on K-means, characterised in that it comprises the following steps:
S1, extracting statistical data from a customer database;
S2, building a customer classification model, following the CRM approach, from each customer's annual purchase frequency and annual total spending;
S3, using the customer classification model of step S2 to obtain a feature vector for each customer;
S4, clustering the feature vectors obtained in step S3 with the K-means machine learning algorithm;
S5, taking the clustering result of step S4 as the customer segmentation result, thereby classifying the customers.
2. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterised in that: in step S1, each customer's basic information is obtained by querying the relational customer database, and each customer's annual purchase frequency and annual spending amount are computed with aggregate functions; these serve as the input data of the automatic classification method.
3. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterised in that: in step S2, the customer classification model is constructed with the customer's annual purchase frequency as the abscissa and the annual total spending as the ordinate.
4. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterised in that: in step S2, the CRM approach is a new type of management method for improving the relationship between an enterprise and its customers, wherein an important part of customer relationship management is customer segmentation, which is essentially a method of dividing a large group of customers or consumers into several classes such that the customers or consumers within one class have similar characteristics.
5. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterised in that: in step S3, the customer classification model established in step S2 is used to obtain the feature vector of each customer; the input data here contain exactly the two dimensions of the customer classification model, so each customer's input data from step S1 are used directly as that customer's feature vector.
6. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterised in that: in step S4, the customer feature vectors of step S3 are used as the input to the K-means algorithm, which is executed to classify the customers, wherein the clustering process of the K-means algorithm is divided into the following three steps:
S41, initialising the cluster centres
k samples are chosen at random from the data set and used as the initial cluster centres of the clustering process (k = 3 here, one per customer category);
S42, assignment
Let $t$ denote the iteration number, $S_i^{(t)}$ the i-th cluster at iteration $t$, $m_i^{(t)}$ the centre of the i-th cluster at iteration $t$, $m_j^{(t)}$ the centre of the j-th cluster at iteration $t$, $x_p$ the data point being assigned, and $k$ the required number of clusters. The assignment rule is:
$$S_i^{(t)} = \left\{ x_p : \lVert x_p - m_i^{(t)} \rVert^2 \le \lVert x_p - m_j^{(t)} \rVert^2 \ \ \forall j,\ 1 \le j \le k \right\}$$
This formula expresses the criterion K-means uses during clustering: each data point $x_p$ is assigned so that its distance to the centre of the cluster it belongs to is minimal;
S43, update
Let $m_i^{(t+1)}$ be the centre of the i-th cluster at iteration $t+1$, $S_i^{(t)}$ the i-th cluster at iteration $t$, and $x_j$ the j-th data point in $S_i^{(t)}$. For each cluster obtained in the assignment step, the cluster centre is recalculated as:
$$m_i^{(t+1)} = \frac{1}{\lvert S_i^{(t)} \rvert} \sum_{x_j \in S_i^{(t)}} x_j$$
This formula expresses how K-means updates a cluster centre: the mean of all data points in the cluster is computed and taken as the cluster's new centre;
wherein a cluster is one of the groups obtained after the clustering algorithm has run, and a cluster centre is the central point of such a group.
7. The automatic classification method for medical consumption customers based on K-means according to claim 1, characterised in that: in step S5, the clustering result obtained in step S4 is taken as the classification result for the customers, the specific categories being three: thrifty customers, economical customers and petit-bourgeois customers.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810477477.7A CN108664653A (en) | 2018-05-18 | 2018-05-18 | A kind of Medical Consumption client's automatic classification method based on K-means |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810477477.7A CN108664653A (en) | 2018-05-18 | 2018-05-18 | A kind of Medical Consumption client's automatic classification method based on K-means |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108664653A true CN108664653A (en) | 2018-10-16 |
Family
ID=63776610
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810477477.7A Pending CN108664653A (en) | 2018-05-18 | 2018-05-18 | A kind of Medical Consumption client's automatic classification method based on K-means |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108664653A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111222686A (en) * | 2019-11-21 | 2020-06-02 | 施甸县保施高速公路投资开发有限公司 | Method for optimizing state of service area of highway |
CN111400375A (en) * | 2020-03-19 | 2020-07-10 | 畅捷通信息技术股份有限公司 | Business opportunity mining method and device based on financial service data |
CN112307111A (en) * | 2020-11-02 | 2021-02-02 | 北京深演智能科技股份有限公司 | Data display method and device |
CN113256348A (en) * | 2021-06-23 | 2021-08-13 | 北京新赛点体育投资股份有限公司 | User account data processing method based on big data statistics |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016187437A1 (en) * | 2015-05-19 | 2016-11-24 | 24/7 Customer, Inc. | Method and system for effecting customer value based customer interaction management |
CN106503438A (en) * | 2016-10-20 | 2017-03-15 | 上海科瓴医疗科技有限公司 | A kind of H RFM user modeling method and system for pharmacy member analysis |
CN106529968A (en) * | 2016-09-29 | 2017-03-22 | 深圳大学 | Customer classification method and system thereof based on transaction data |
CN107480187A (en) * | 2017-07-10 | 2017-12-15 | 北京京东尚科信息技术有限公司 | User's value category method and apparatus based on cluster analysis |
CN107992883A (en) * | 2017-11-22 | 2018-05-04 | 福建省计量科学研究院 | A kind of metering industry customer's divided method based on CRFM models |
Non-Patent Citations (1)
Title |
---|
谭军: "基于CRM数据挖掘的电信客户细分模型分析与设计", 《万方》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Syakur et al. | Integration k-means clustering method and elbow method for identification of the best customer profile cluster | |
CN108664653A (en) | A kind of Medical Consumption client's automatic classification method based on K-means | |
CN102663100B (en) | Two-stage hybrid particle swarm optimization clustering method | |
US9536201B2 (en) | Identifying associations in data and performing data analysis using a normalized highest mutual information score | |
CN105095522B (en) | Relation table set external key recognition methods based on nearest neighbor search | |
CN103136355B (en) | A kind of Text Clustering Method based on automatic threshold fish-swarm algorithm | |
CN107832456B (en) | Parallel KNN text classification method based on critical value data division | |
CN108734216A (en) | Classification of power customers method, apparatus and storage medium based on load curve form | |
CN111062425B (en) | Unbalanced data set processing method based on C-K-SMOTE algorithm | |
CN109117872A (en) | A kind of user power utilization behavior analysis method based on automatic Optimal Clustering | |
CN110502691A (en) | Product method for pushing, device and readable storage medium storing program for executing based on client segmentation | |
Yu et al. | Quantization-based clustering algorithm | |
Peng et al. | The health care fraud detection using the pharmacopoeia spectrum tree and neural network analytic contribution hierarchy process | |
CN106204267A (en) | A kind of based on improving k means and the customer segmentation system of neural network clustering | |
Aggelis et al. | Customer clustering using rfm analysis | |
CN107729377A (en) | Customer classification method and system based on data mining | |
CN115641177A (en) | Prevent second and kill prejudgement system based on machine learning | |
CN107093005A (en) | The method that tax handling service hall's automatic classification is realized based on big data mining algorithm | |
Baswade et al. | A comparative study of k-means and weighted k-means for clustering | |
CN109583712B (en) | Data index analysis method and device and storage medium | |
Chen et al. | Efficient clustering method based on rough set and genetic algorithm | |
CN108388911A (en) | A kind of mobile subscriber's Dynamic Fuzzy Clustering Algorithm method towards mixed attributes | |
Zheng | Application of silence customer segmentation in securities industry based on fuzzy cluster algorithm | |
Li et al. | K-LRFMD: method of customer value segmentation in shared transportation filed based on improved K-means algorithm | |
Féraud et al. | The orange customer analysis platform |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20181016 |