CN108876444A

CN108876444A - Client category analysis method, apparatus, computer device and storage medium

Info

Publication number: CN108876444A
Application number: CN201810545793.3A
Authority: CN
Inventors: 金戈; 徐亮; 肖京
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2018-05-25
Filing date: 2018-05-25
Publication date: 2018-11-23
Also published as: WO2019223082A1

Abstract

This application discloses a client category analysis method, apparatus, computer device and storage medium. The method includes: obtaining data related to a new client from each of a plurality of channels; clustering the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the channels; forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new client; calculating the similarity between the first vector matrix and each of a plurality of second vector matrices in a preset client category database, where the client category database contains a plurality of client categories and second vector matrices in one-to-one correspondence with the client categories; and recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client. The application can accurately obtain the client category of a new client and can therefore accurately recommend products suited to the new client.

Description

Client category analysis method, apparatus, computer device and storage medium

Technical field

This application relates to the computer field, and in particular to a client category analysis method, apparatus, computer device and storage medium.

Background technique

In insurance and in investment and wealth management, dedicated systems perform statistics and calculations, for example classifying clients. The industry currently classifies clients through a single channel: the client's information is obtained from one channel, the data is analyzed, and the client is finally assigned to a corresponding client category. Because the data comes from a single channel, the accuracy of the client category analysis suffers greatly if the data on that channel is scarce or fraudulent.

Summary of the invention

The main purpose of this application is to provide a client category analysis method, apparatus, computer device and storage medium, with the aim of improving the accuracy of client category analysis.

To achieve this purpose, the application first proposes a client category analysis method, including:

Obtaining data related to a new client from each of a plurality of channels;

Clustering the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the plurality of channels;

Forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new client;

Calculating the similarity between the first vector matrix and each of a plurality of second vector matrices in a preset client category database, where the client category database contains a plurality of client categories and second vector matrices in one-to-one correspondence with the client categories;

Recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client.

Further, after the step of recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client, the method includes:

Looking up product information corresponding to the client category of the new client;

Recommending the product information to the new client.

Further, the step of recommending the product information to the new client includes:

Presenting the product information to the new client in chart form, where the chart form includes a text introduction of each product in the product information and a sales data chart of the product.

Further, the step of clustering the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the plurality of channels includes:

Performing feature extraction on the data obtained from each channel separately to obtain multiple pieces of feature data for each channel;

Extracting, from the multiple pieces of feature data of each channel, the feature data that is uncorrelated with the other feature data, as uncorrelated feature data;

Removing the data corresponding to the uncorrelated feature data of each channel, and clustering the remaining data of each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the plurality of channels.

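Further, after the step of recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client, the method includes:

Obtaining the medical data of the new client;

Selecting, according to the medical data, the insurance product information of the insurance products the new client is suited to purchase;

Selecting, from the insurance product information of the insurance products suitable for purchase, the insurance product information corresponding to the client category of the new client, and recommending it to the new client.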

Further, the step of selecting, according to the medical data, the insurance product information of the insurance products the new client is suited to purchase includes:

Performing feature extraction on the medical data to obtain multiple pieces of medical feature data;

Extracting, from the multiple pieces of medical feature data, the feature data that is uncorrelated with the other medical feature data, as uncorrelated medical feature data;

Removing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the retained medical data, the insurance product information of the insurance products the new client is suited to purchase.

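Further, after the step of recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client, the method includes:

Obtaining the credit reference data of the new client;

Selecting, according to the credit reference data, the credit product information of the credit products the new client is suited to apply for;

Selecting, from the credit product information of the credit products suitable to apply for, the credit product information corresponding to the client category, and recommending it to the new client.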

The application also provides a client category analysis apparatus, including:

An acquiring unit, for obtaining data related to a new client from each of a plurality of channels;

A clustering unit, for clustering the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the plurality of channels;

A vectorization unit, for forming the multiple groups of cluster data into a sparse matrix and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new client;

A computing unit, for calculating the similarity between the first vector matrix and each of a plurality of second vector matrices in a preset client category database, where the client category database contains a plurality of client categories and second vector matrices in one-to-one correspondence with the client categories;

A selecting unit, for recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client.

The application also provides a computer device including a memory and a processor, where the memory stores a computer program and the processor implements the steps of any of the above methods when executing the computer program.

The application also provides a computer-readable storage medium on which a computer program is stored, where the computer program implements the steps of any of the above methods when executed by a processor.

The client category analysis method, apparatus, computer device and storage medium of the application obtain and analyze the data of a new client from a plurality of channels, which makes the assessment of each person more accurate. Multi-channel assessment evaluates the new client more comprehensively and avoids assessment errors caused by falsified personal information on a single channel, so that the client category of the new client is obtained accurately, products suited to the new client are recommended, and the efficiency of converting recommendations into sales improves.

Brief description of the drawings

Fig. 1 is a flow diagram of the client category analysis method of one embodiment of the application;

Fig. 2 is a flow diagram of the client category analysis method of one embodiment of the application;

Fig. 3 is a detailed flow diagram of step S2 of the client category analysis method of one embodiment of the application;

Fig. 4 is a structural block diagram of the client category analysis apparatus of one embodiment of the application;

Fig. 5 is a structural block diagram of the client category analysis apparatus of one embodiment of the application;

Fig. 6 is a structural block diagram of the first recommendation unit of one embodiment of the application;

Fig. 7 is a structural block diagram of the clustering unit of one embodiment of the application;

Fig. 8 is a structural block diagram of the client category analysis apparatus of one embodiment of the application;

Fig. 9 is a structural block diagram of the client category analysis apparatus of one embodiment of the application;

Fig. 10 is a structural block diagram of the computer device of one embodiment of the application.

The realization of the purpose, the functional characteristics and the advantages of the application are further described below with reference to the embodiments and the accompanying drawings.

Detailed description of the embodiments

To make the purpose, technical solutions and advantages of the application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.

Referring to Fig. 1, an embodiment of the application proposes a client category analysis method, including the steps of:

S1, obtaining data related to a new client from each of a plurality of channels;

S2, clustering the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the plurality of channels;

S3, forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new client;

S4, calculating the similarity between the first vector matrix and each of a plurality of second vector matrices in a preset client category database, where the client category database contains a plurality of client categories and second vector matrices in one-to-one correspondence with the client categories;

S5, recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client.

As described in step S1, the plurality of channels means two or more channels, and a channel is a channel through which data is obtained, such as a game channel, an online interaction channel, a consumption channel or a social interaction channel. The main ways of obtaining the data of each channel include purchasing the data and crawling it with crawler technology. In this embodiment four channels are selected: a game channel, an online interaction channel (WeChat), a consumption channel (Alipay) and a social interaction channel (Weibo). The game, WeChat and Alipay channel data can be purchased after authorization by the party concerned; Weibo data can be obtained with crawler technology and can of course also be purchased. The game channel generally uses WeChat game data; in other embodiments, data from other games such as NetEase Games or Shanda Games may also be used as the game channel data. The game channel data mainly includes game spending data, play time data and the like; the WeChat channel data mainly includes Moments data (including posts to Moments, the number of people in Moments, the number of long-term interactions, the content posted to Moments, the content posted by others in Moments and the like); the Alipay channel data mainly includes consumption record data, consumption location data, consumption type data and the like; the Weibo channel data mainly includes the content of posted microblogs, follow records, the content posted by followed microblogs and the like. In other embodiments, data from further channels may also be obtained, such as travel data (choice of transport, travel frequency and the like) and catering channel data (catering spending, catering type, meal times and the like).

As described in step S2, the clustering processing clusters the data of each channel separately. The clustering algorithm selected is K-means: initialize the constant K and randomly select initial points as centroids; assign each data point to the nearest centroid; recompute the centroids; repeat the previous two steps until the centroids no longer change. Because K-means is an existing clustering algorithm, the specific clustering process is not repeated here. The application uses K-means because the algorithm is fast and simple, is efficient and scalable on large data sets, has a time complexity close to linear, and is suitable for mining large-scale data sets.
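The K-means loop described for step S2 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation of the application; the function name `kmeans`, the tuple-based point format and the `max_iter` and `seed` parameters are assumptions made for the example.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points (tuples of floats) into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # randomly selected initial points as centroids
    labels = []
    for _ in range(max_iter):
        # Assign every data point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of the points assigned to it.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:  # stop once the centroids no longer change
            break
        centroids = new_centroids
    return centroids, labels
```

In the setting of step S2, such a function would be called once per channel, so that each channel yields its own group of cluster data.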

As described in step S3, a sparse matrix is a matrix in which the number of nonzero elements is far smaller than the total number of elements and the nonzero elements are distributed without a regular pattern. Clustering the groups of data reduces the data volume, and because the groups differ in data type and source, once the cluster result of each group is substituted into the preset matrix, the nonzero elements are distributed irregularly and with little correlation, which forms a sparse matrix. Specifically, the data comprises four groups: the cluster result of one group serves as the first row of the sparse matrix, and the cluster results of the other three groups serve as the second, third and fourth rows. The collaborative filtering method fills the vacancies between the nonzero elements of the sparse matrix with zeros; that is, because the cluster results of the groups differ, zeros are added between nonzero elements so that the data aligns, which yields the first vector matrix. The first vector matrix thus contains the data features of all the channels, so that in subsequent use the overall judgment is not affected by tampering with the data of any single channel.
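The row-per-channel layout and zero filling described for step S3 can be illustrated with a short sketch. It follows the zero-filling description in the text rather than a rating-prediction style of collaborative filtering. The function name `build_first_vector_matrix` and the use of `None` to mark a vacancy are assumptions made for the example.

```python
def build_first_vector_matrix(cluster_rows, width=None):
    """Stack per-channel cluster results as rows and fill every vacancy with zero."""
    width = width or max(len(row) for row in cluster_rows)
    matrix = []
    for row in cluster_rows:
        # None marks a position where this channel produced no value.
        filled = [0.0 if v is None else float(v) for v in row]
        filled += [0.0] * (width - len(filled))  # pad short rows so all rows align
        matrix.append(filled)
    return matrix
```

With four channels, `build_first_vector_matrix` receives four rows and returns a four-row matrix in which every row has the same length.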

As described in step S4, the second vector matrices are vector matrices sorted out in advance from historical clients. Because the category of each historical client has already been confirmed, the data obtained for each historical client through the same channels can be passed through steps S1-S3 to generate a first vector matrix in the same way, except that its client category is known. This yields multiple first vector matrices for the historical clients of each client category; averaging the first vector matrices of the same client category gives the second vector matrix of that category. The client categories of the historical clients are classified using learning vector quantization (LVQ), as follows:
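The averaging that turns the first vector matrices of historical clients into one second vector matrix per client category might look like the sketch below. The function name `second_vector_matrices` and the `(category, matrix)` record format are hypothetical, and all matrices are assumed to share the same shape.

```python
from collections import defaultdict

def second_vector_matrices(history):
    """history: list of (category, matrix) pairs; returns {category: element-wise mean matrix}."""
    grouped = defaultdict(list)
    for category, matrix in history:
        grouped[category].append(matrix)
    result = {}
    for category, matrices in grouped.items():
        n = len(matrices)
        rows, cols = len(matrices[0]), len(matrices[0][0])
        # Average the matrices of one category element by element.
        result[category] = [
            [sum(m[r][c] for m in matrices) / n for c in range(cols)]
            for r in range(rows)
        ]
    return result
```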

(1) Initialize the labeled sample set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where $D$ is the collected sample set, $x_j$ is a sample point and $y_j$ is its label;

(2) Initialize the vector labels $t_i$, where $t_i$ is the label of the $i$-th vector;

(3) Each sample is described by $n$ features: $x_j = (x_{j1}, x_{j2}, \ldots, x_{jn})$, $y_j \in Y$, $j = 1, 2, \ldots, m$, where $x_{jl}$ denotes the $l$-th feature of sample point $x_j$;

(4) The learning objective of LVQ is to obtain $k$ vectors $q_1, q_2, \ldots, q_k$, where each $q_i$ is a vector to be learned;

(5) Initialize the vectors: a sample $x_j$ satisfying $y_j = t_i$ is taken as the initial value of $q_i$;

(6) Select a sample $x_j$ from $D$ at random and find the vector $q_i$ nearest to it; if $y_j$ equals $t_i$, then $q' = q_i + \eta (x_j - q_i)$; otherwise $q' = q_i - \eta (x_j - q_i)$, where $\eta$ is a parameter (the learning rate);

(7) Update the vector: $q_i = q'$;

(8) Decide whether to stop iterating according to a maximum number of iterations or a vector update threshold;

(9) Once the vectors are obtained, each vector corresponds to a region, and the sample points in that region belong to the class of that vector, which yields the client categories.
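Steps (1) to (9) can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the names `lvq` and `classify`, the default learning rate `eta=0.1`, and stopping only on a fixed iteration count (rather than also on an update threshold) are choices made for the example.

```python
import random

def lvq(samples, labels, proto_labels, eta=0.1, max_iter=200, seed=0):
    """Learn one vector q_i per entry t_i of proto_labels using the update rule of step (6)."""
    rng = random.Random(seed)
    # Step (5): initialise each q_i from a random sample whose label equals t_i.
    protos = [
        list(rng.choice([x for x, y in zip(samples, labels) if y == t]))
        for t in proto_labels
    ]
    for _ in range(max_iter):  # step (8): stop after a maximum number of iterations
        j = rng.randrange(len(samples))  # step (6): pick a sample x_j at random
        x, y = samples[j], labels[j]
        # Find the nearest vector q_i (squared Euclidean distance).
        i = min(range(len(protos)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(x, protos[k])))
        sign = 1.0 if y == proto_labels[i] else -1.0
        # q' = q_i + eta (x_j - q_i) when labels match, q_i - eta (x_j - q_i) otherwise.
        protos[i] = [q + sign * eta * (a - q) for a, q in zip(x, protos[i])]  # step (7)
    return protos

def classify(x, protos, proto_labels):
    """Step (9): a point belongs to the class of its nearest learned vector."""
    i = min(range(len(protos)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(x, protos[k])))
    return proto_labels[i]
```

A call such as `classify(point, protos, ["a", "b"])` then returns the label of the region into which `point` falls.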

In the application, because the first vector matrix is the vector representation of the client to be classified, its similarity with each preset second vector matrix needs to be calculated to find the most similar second vector matrix. The similarity can be calculated using any one of Euclidean distance, Manhattan distance, Minkowski distance or cosine similarity.
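The similarity measures named above and the selection of the most similar second vector matrix can be sketched as follows. The function names, the flattening of each matrix into a single vector before comparison, and the dictionary layout of the category database are assumptions made for this illustration.

```python
import math

def flatten(matrix):
    """Turn a matrix (list of rows) into one flat vector."""
    return [v for row in matrix for v in row]

def minkowski_distance(a, b, p=2):
    """p=1 gives the Manhattan distance, p=2 gives the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar_category(first, category_db, metric="cosine"):
    """category_db maps a client category name to its second vector matrix."""
    a = flatten(first)

    def score(item):
        b = flatten(item[1])
        if metric == "cosine":
            return cosine_similarity(a, b)
        return -minkowski_distance(a, b, p=2)  # a smaller distance means more similar

    return max(category_db.items(), key=score)[0]
```

For the distance-based measures, the highest similarity corresponds to the smallest distance, which is why the sketch negates the distance before taking the maximum.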

As described in step S5, the client category of the new client is recorded for later use, such as recommending products to the new client.

Referring to Fig. 2, in one embodiment, after step S5 of recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client, the method includes:

S6, looking up product information corresponding to the client category of the new client;

S7, recommending the product information to the new client.

As described in steps S6 and S7, product information is recommended to the new client according to the new client's category, which improves the efficiency of converting recommendations into sales. In the application, the product information corresponding to a client category is the product information of the products that clients of that category buy most. Because data records show which products each client category buys in larger quantities, the product information corresponding to the new client is easy to obtain. The method of looking up the product information corresponding to the client category of the new client specifically includes: (1) looking up, in a preset database, all product records purchased by clients of the client category corresponding to the new client; (2) finding the qualifying product information in the product records, where qualifying means that the products are sorted by purchase quantity from most to least and rank above a specified rank; (3) recommending the qualifying product information to the new client by e-mail, WeChat, short message or the like.
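The lookup in (1) and (2) amounts to counting purchases within one client category and keeping the top-ranked products. A minimal sketch, assuming purchase records are available as `(client_category, product)` pairs; the function name `recommend_products` and the `top_n` parameter standing in for the specified rank are hypothetical.

```python
from collections import Counter

def recommend_products(purchases, category, top_n=3):
    """purchases: iterable of (client_category, product) records."""
    # Count how often each product was bought by clients of the given category.
    counts = Counter(product for cat, product in purchases if cat == category)
    # most_common orders products by purchase quantity, from most to least.
    return [product for product, _ in counts.most_common(top_n)]
```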

In one embodiment, step S7 of recommending the product information to the new client includes:

S71, presenting the product information to the new client in chart form, where the chart form includes a text introduction of each product in the product information and a sales data chart of the product.

As described in step S71, the sales data chart of a product may be a bar chart, a line chart, an area chart or any other chart that can represent data. The product information recommended to the new client may include multiple products whose sales data differ; once the sales data is visualized, the new client can see at a glance which product sells best, which makes it more efficient for the new client to review the recommendation.

Referring to Fig. 3, in one embodiment, step S2 of clustering the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the plurality of channels includes:

S21, performing feature extraction on the data obtained from each channel separately to obtain multiple pieces of feature data for each channel;

S22, extracting, from the multiple pieces of feature data of each channel, the feature data that is uncorrelated with the other feature data, as uncorrelated feature data;

S23, removing the data corresponding to the uncorrelated feature data of each channel, and clustering the remaining data of each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the plurality of channels.

As described in steps S21, S22 and S23, feature extraction is performed separately on the data obtained from each channel, giving one group of feature data per channel, each group containing multiple pieces of feature data. Correlation analysis is then performed on each group to find the feature data that is uncorrelated with the other feature data in the group, and that feature data is recorded as uncorrelated feature data. Because it is uncorrelated with the other feature data, the data corresponding to it may be problematic, so the potentially problematic data is discarded in advance to improve the accuracy of the subsequent clustering result. In the application, the feature extraction may use the ReliefF algorithm, proposed by Kononenko in 1994 as an improvement on the Relief algorithm. Relief is a feature weighting algorithm that assigns each feature a weight according to its correlation with the class and removes features whose weight falls below a threshold; ReliefF extends Relief to handle multi-class problems. Because ReliefF is a known algorithm, the feature extraction process is not described further.
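The idea behind steps S22 and S23, dropping feature data that is uncorrelated with all other feature data before clustering, can be illustrated with a simple sketch. It deliberately swaps in plain Pearson correlation instead of ReliefF to keep the example short; the function names and the `threshold` value are assumptions made for the illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0

def drop_uncorrelated(features, threshold=0.3):
    """features: {name: list of values}. Keep only features correlated with at least one other."""
    kept = {}
    for name, values in features.items():
        strongest = max(
            (abs(pearson(values, other)) for n2, other in features.items() if n2 != name),
            default=0.0,
        )
        if strongest >= threshold:  # below the threshold the feature counts as uncorrelated
            kept[name] = values
    return kept
```

The features that survive `drop_uncorrelated` would then be passed on to the clustering of step S23.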

In one embodiment, after step S5 of recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client, the method includes:

S501, obtaining the medical data of the new client;

S502, selecting, according to the medical data, the insurance product information of the insurance products the new client is suited to purchase;

S503, selecting, from the insurance product information of the insurance products suitable for purchase, the insurance product information corresponding to the client category of the new client, and recommending it to the new client.

As described in steps S501, S502 and S503, this embodiment is mainly used in insurance sales. The medical data mainly includes the new client's social security card usage at hospitals and the client's electronic medical records; the medical data allows a preliminary judgment of the new client's physical condition. The insurance products suitable for purchase are first screened out of the insurance product information library according to that physical condition; the insurance product information corresponding to the client category is then found among the insurance product information suitable for purchase and recommended to the new client. Screening by physical condition is needed because some insurance products cannot be purchased by people with certain conditions. For example, if the medical data shows that the new client has a certain disease and an insurance product happens to include coverage for that disease, the product is not suitable for the new client, whereas insurance products without coverage for that disease may be suitable for the new client to purchase. In this application, the insurance product information recommended to the new client may be that of the insurance product that sells most to the client category, or that of the insurance products whose sales rank for the client category is above a specified rank (the more sold, the higher the rank).
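The two-stage selection of steps S502 and S503 can be sketched as a filter followed by a ranking. The function name `recommend_insurance`, the product names in the usage and the representation of coverage as sets of disease names are all hypothetical, chosen only to make the illustration concrete.

```python
def recommend_insurance(products, diagnosed, category_ranking, top_n=2):
    """products: {name: set of diseases covered}; diagnosed: set of the client's diseases.

    category_ranking lists product names for the client category, best seller first.
    """
    # Keep only products that do not insure a disease the client already has.
    suitable = {name for name, covered in products.items() if not (covered & diagnosed)}
    # Among those, follow the sales ranking of the client category.
    return [name for name in category_ranking if name in suitable][:top_n]
```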

In one embodiment, step S502 of selecting, according to the medical data, the insurance product information of the insurance products the new client is suited to purchase includes:

S5021, performing feature extraction on the medical data to obtain multiple pieces of medical feature data;

S5022, extracting, from the multiple pieces of medical feature data, the feature data that is uncorrelated with the other medical feature data, as uncorrelated medical feature data;

S5023, removing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the retained medical data, the insurance product information of the insurance products the new client is suited to purchase.

As described in steps S5021, S5022 and S5023, medical data may contain data related to insurance fraud, and such data generally differs from routine data. For example, when someone deliberately buys medicine with a social security card and then resells it to other shops, the card's swipe frequency and swipe amounts follow a certain pattern: the medicine bought differs each time but the amounts are similar, and a purchase is made at regular intervals. The correlation of such medical data is low, so correlation analysis of the feature data extracts it as uncorrelated medical feature data; the medical data corresponding to the uncorrelated medical feature data is then removed, and the retained medical data is used to determine the insurance product information of the insurance products the new client can purchase.

In other embodiments, the facial features of the new client may also be collected and input into preset disease prediction models for different diseases to determine whether the new client has the corresponding disease, and insurance product information adapted to the result is then provided for the new client to choose from. A disease prediction model is trained on the facial features of a large number of different people together with the disease status associated with each face; given new facial features, it outputs whether the person has the disease.

In another possible embodiment, after step S5 of recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client, the method includes:

S511, obtaining the credit reference data of the new client;

S512, selecting, according to the credit reference data, the credit product information of the credit products the new client is suited to apply for;

S513, selecting, from the credit product information of the credit products suitable to apply for, the credit product information corresponding to the client category, and recommending it to the new client.

As described in steps S511, S512 and S513, this embodiment is mainly used in financial credit. The credit products include small loans, mortgage loans, home purchase loans and the like. The credit reference data is the new client's credit standing in the banking system. For example, if the new client has repeatedly failed to repay a credit card on time, the credit value is low and a large mortgage loan may not be available; if the new client has used a credit card for a long time and repaid on time every month, the credit value is high and a large loan is possible; if the new client does not use a credit card, the credit value is the initial value and credit products with a moderate loan limit are considered. In the application, the credit products the new client can apply for are determined first, and the credit products corresponding to the client category are then selected from them, which greatly improves the effect of the recommendation and helps the new client accurately select credit products it can apply for.
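The rule-of-thumb reasoning in steps S511 to S513 can be sketched as a small rule set followed by an intersection with the products of the client category. Everything concrete here is hypothetical: the tier names, the cut-off of two late payments for "repeatedly late", the product names in the usage and both function names are assumptions made for the illustration.

```python
def credit_tier(uses_credit_card, late_payments, late_cutoff=2):
    """Map a simplified credit record to a coarse credit tier."""
    if not uses_credit_card:
        return "initial"  # no credit card use: the credit value stays at its initial value
    if late_payments >= late_cutoff:  # hypothetical cut-off for "repeatedly late"
        return "low"
    return "high"  # card in use and repaid on time

def eligible_credit_products(products, tier, category_products):
    """products: {name: set of tiers allowed to apply}.

    Returns the products the client may apply for that also belong to the client category.
    """
    applicable = {name for name, tiers in products.items() if tier in tiers}
    return sorted(applicable & set(category_products))
```

The order mirrors the text: eligibility is decided from the credit record first, and only then is the result narrowed to the products associated with the client category.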

The client category analysis method of the embodiment of the application obtains and analyzes the data of a new client from a plurality of channels, which makes the assessment of each person more accurate. Multi-channel assessment evaluates the new client more comprehensively and avoids assessment errors caused by falsified personal information on a single channel, so that the client category of the new client is obtained accurately, products suited to the new client are recommended, and the efficiency of converting recommendations into sales improves.

Referring to Fig. 4, an embodiment of the application proposes a client category analysis apparatus, including:

An acquiring unit 10, for obtaining data related to a new client from each of a plurality of channels;

A clustering unit 20, for clustering the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the plurality of channels;

A vectorization unit 30, for forming the multiple groups of cluster data into a sparse matrix and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new client;

A computing unit 40, for calculating the similarity between the first vector matrix and each of a plurality of second vector matrices in a preset client category database, where the client category database contains a plurality of client categories and second vector matrices in one-to-one correspondence with the client categories;

A selecting unit 50, for recording the client category corresponding to the second vector matrix most similar to the first vector matrix as the client category of the new client.

In the acquiring unit 10, "multiple channels" means two or more channels, and a channel is a source from which data is obtained, such as a game channel, an online interaction channel, a consumption channel or a social interaction channel. The main ways of obtaining the data of each channel include purchasing the data and crawling the data with crawler technology. In this embodiment, four channels are selected: a game channel, an online interaction channel (WeChat), a consumption channel (Alipay) and a social interaction channel (Weibo). The game, WeChat and Alipay channel data can be purchased after authorization by the party concerned; the Weibo data can be obtained with crawler technology and, of course, can also be purchased. The game channel generally uses WeChat games as its data source; in other embodiments, data from other games, such as NetEase games or Shanda games, can also be used as the game channel data. The data of the game channel mainly includes game consumption data, play time data and the like. The data of the WeChat channel mainly includes Moments data (including Moments posts, the number of Moments posts, the number of people with long-term interaction, the content of the posts, and the content posted by others in Moments). The data of the Alipay channel mainly includes consumption record data, consumption location data, consumption type data and the like. The data of the Weibo channel mainly includes the content of posted microblogs, follow records, and the content posted by followed accounts. In other embodiments, data from further channels can also be obtained, such as traffic data on choice of vehicle and frequency of business trips, or catering channel data on catering spending, catering type and catering time.

In the clustering unit 20, the clustering is performed separately on the data of each channel, using the K-means clustering algorithm: initialize the constant K and randomly select initial points as centroids; assign each data point to the nearest centroid; recalculate the centroids; repeat the previous two steps until the centroids no longer change. Because K-means is an existing clustering algorithm, the specific clustering process is not repeated here. The present application uses the K-means clustering algorithm because it is fast and simple, is efficient and scalable on large data sets, has nearly linear time complexity, and is suitable for mining large-scale data sets.
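The four K-means steps named above can be illustrated with a minimal pure-Python sketch. The function name `kmeans`, its parameters and the tuple representation of data points are assumptions made for illustration; the application does not specify an implementation.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step 1: pick k data points at random as the initial centroids.
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest centroid (squared distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Step 3: recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

In this sketch each channel's data would be passed to `kmeans` separately, so that one cluster result is produced per channel.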

In the vectorization unit 30, a sparse matrix is a matrix in which the number of nonzero elements is far smaller than the total number of elements and the nonzero elements are distributed without regularity. Clustering the multiple groups of data reduces the data volume, and because the groups differ in data type and source, once the cluster result of each group is placed into a preset matrix the nonzero elements are irregularly distributed and weakly correlated, which forms a sparse matrix. Specifically, the data comprises four groups: the cluster result of one group is used as the first row of the sparse matrix, and the cluster results of the other three groups are used as the second, third and fourth rows respectively. The collaborative filtering method fills the vacancies between nonzero elements of the sparse matrix with zeros; that is, because the cluster results of the groups differ, zeros must be added between nonzero elements so that the data correspond, which yields the first vector matrix. The first vector matrix contains the data features of the multiple channels, so in subsequent use the overall judgment is not affected if the data of a single channel has been tampered with.
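A minimal sketch of the row-per-channel layout and zero filling described above follows. It assumes each channel's cluster result is already a flat list of numbers, and it pads zeros at the end of short rows, which is a simplification of the alignment between nonzero elements that the text describes. The function name `build_first_vector_matrix` is hypothetical.

```python
def build_first_vector_matrix(cluster_results):
    """Stack per-channel cluster results as rows, zero-filling short rows.

    `cluster_results` is a list with one numeric list per channel
    (an assumed representation, not specified by the application).
    """
    width = max(len(row) for row in cluster_results)
    # Every row is extended with zeros so that all rows share one width.
    return [list(row) + [0.0] * (width - len(row)) for row in cluster_results]
```

With four channels the result has four rows, matching the first-to-fourth-row layout in the embodiment.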

In the computing unit 40, the second vector matrices are vector matrices derived in advance from historical clients. Because the category of each historical client has already been confirmed, the data of each historical client obtained through the same channels is processed by the acquiring unit 10, the clustering unit 20 and the vectorization unit 30 to generate a first vector matrix in the same way, except that its client category is known. This gives multiple first vector matrices for the historical clients of each client category, and averaging the first vector matrices of the same client category gives the second vector matrix of that client category. The client categories of the historical clients are classified using learning vector quantization, and the specific process is as follows:

(1) Initialize the labeled sample set D = {(x₁, y₁), (x₂, y₂), …, (x_m, y_m)}, where D is the acquired sample set, x_i is a sample vector and y_i is its category label;

(2) Initialize the vector labels t_i, where t is the label of a prototype vector;

(5) Initialize the vectors: a sample satisfying y_j = t_j is taken as the initial value of q_j;

(7) Update the vector: q_i = q′.
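The listed steps do not include the distance and update formulas, so the following sketch uses the textbook LVQ1 rule: move the nearest prototype toward a sample with the same label and away from a sample with a different label. This is an assumption and may differ from the procedure intended by the application; `train_lvq`, `predict`, the learning rate `lr` and the iteration count are all hypothetical.

```python
import random

def _sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_lvq(samples, prototype_labels, lr=0.1, iterations=50, seed=0):
    """samples: list of (vector, label); returns one prototype per label t_i."""
    rng = random.Random(seed)
    # Initialize each prototype q_j from a sample whose label y_j equals t_j.
    prototypes = [
        list(rng.choice([x for x, y in samples if y == t]))
        for t in prototype_labels
    ]
    for _ in range(iterations):
        x, y = rng.choice(samples)
        # Find the prototype nearest to the drawn sample.
        i = min(range(len(prototypes)), key=lambda j: _sq_dist(x, prototypes[j]))
        # Pull toward x when labels match, push away otherwise (LVQ1 rule).
        sign = 1.0 if prototype_labels[i] == y else -1.0
        prototypes[i] = [q + sign * lr * (a - q) for a, q in zip(x, prototypes[i])]
    return prototypes

def predict(x, prototypes, prototype_labels):
    """Return the label of the prototype nearest to x."""
    i = min(range(len(prototypes)), key=lambda j: _sq_dist(x, prototypes[j]))
    return prototype_labels[i]
```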

In the selecting unit 50, the client category of the new client is marked so that it can be used subsequently, for example for recommending products to the new client.
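The work of the computing unit 40 and the selecting unit 50 can be sketched as follows: average the historical first vector matrices of each category to obtain a second vector matrix, then pick the category whose second vector matrix is most similar to the new client's matrix. This passage does not name a similarity measure, so cosine similarity over the flattened matrices is used here as an assumption; all function names are hypothetical.

```python
import math

def average_matrices(matrices):
    """Element-wise mean of equally shaped matrices (lists of rows)."""
    n = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) / n for c in range(cols)]
            for r in range(rows)]

def cosine_similarity(a, b):
    """Cosine similarity of two matrices, treated as flat vectors."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm = math.sqrt(sum(x * x for x in flat_a)) * math.sqrt(sum(y * y for y in flat_b))
    return dot / norm if norm else 0.0

def classify(first_matrix, category_db):
    """Return the category whose second vector matrix is most similar."""
    return max(category_db,
               key=lambda cat: cosine_similarity(first_matrix, category_db[cat]))
```

Here `category_db` plays the role of the preset client category database: a mapping from category name to its averaged second vector matrix.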

Referring to Fig. 5, in one embodiment, the client category analysis device further includes:

A searching unit 60, configured to search for product information corresponding to the client category of the new client;

A first recommendation unit 70, configured to recommend the product information to the new client.

The searching unit 60 and the first recommendation unit 70 recommend product information to the new client according to the new client's category, improving the efficiency of sales promotion. In the present application, the product information corresponding to a client category is the information of the products purchased most by that category: since clients of any category buy certain products in larger quantities and these purchases are recorded, the product information corresponding to the new client is easy to obtain. The method of searching for the product information corresponding to the client category of the new client specifically includes: (1) searching a preset database for all product records purchased by clients of the new client's category; (2) searching those product records for product information that meets the requirement, where meeting the requirement means that, with purchase quantity sorted from most to least, the product ranks above a specified rank; (3) recommending the qualifying product information to the new client, by means such as e-mail, WeChat or SMS.
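Steps (1) and (2) of the lookup above amount to a filtered top-N ranking by purchase quantity. The sketch below assumes purchase records are `(category, product, quantity)` tuples, which is a hypothetical layout; the application does not describe the database schema.

```python
from collections import Counter

def recommend_products(purchase_records, client_category, top_n=3):
    """Return the top_n products bought most by clients of one category."""
    counts = Counter()
    for category, product, quantity in purchase_records:
        # Step (1): keep only records from the new client's category.
        if category == client_category:
            counts[product] += quantity
    # Step (2): rank by purchase quantity, most to least, and cut at top_n.
    return [product for product, _ in counts.most_common(top_n)]
```

Step (3), delivery by e-mail, WeChat or SMS, is outside the scope of this sketch.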

Referring to Fig. 6, in one embodiment, the first recommendation unit 70 includes:

A chart recommendation module 71, configured to present the product information in chart form and recommend it to the new client, wherein the chart form includes a text introduction of the products in the product information and sales data graphs of the products.

In the chart recommendation module 71, the sales data graph of a product can be a bar chart, a line chart, an area chart or any other graph that can represent data. The product information recommended to the new client may include multiple products whose sales data differ; once the sales data is visualized, the new client can see at a glance which product has the highest sales volume, which improves the efficiency with which the new client reviews the recommendation.

Referring to Fig. 7, in one embodiment, the clustering unit 20 includes:

A first feature extraction module 21, configured to perform feature extraction separately on the data obtained from each channel, obtaining multiple feature data for each channel;

A first correlation analysis module 22, configured to extract, from the multiple feature data of each channel, the feature data that is uncorrelated with the other feature data, as uncorrelated feature data;

A first removal and clustering module 23, configured to remove the data corresponding to the uncorrelated feature data of each channel and to cluster the remaining data of each channel separately, obtaining multiple groups of cluster data in one-to-one correspondence with the multiple channels.

In the first feature extraction module 21, the first correlation analysis module 22 and the first removal and clustering module 23, feature extraction is performed separately on the data obtained from each channel, giving one group of feature data per channel, each group containing multiple feature data. Correlation analysis is then performed on each group to find the feature data that is uncorrelated with the other feature data in the group, which is recorded as uncorrelated feature data. Because it is uncorrelated with the other feature data, the data it corresponds to may be problematic, so this possibly problematic data is removed in advance to improve the accuracy of the subsequent clustering. In the present application, the feature extraction can use the ReliefF algorithm, which Kononenko proposed in 1994 as an improvement on the Relief algorithm. Relief is a feature weighting algorithm that assigns each feature a weight according to its correlation with the category and removes features whose weight is below a threshold. Compared with Relief, ReliefF can handle multi-class problems. Because ReliefF is a known algorithm, the feature extraction process is not described further.
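The correlation analysis step can be illustrated with a small sketch that flags a feature when its strongest absolute Pearson correlation with every other feature falls below a threshold. The application does not name the correlation measure or a threshold, so Pearson correlation, the value 0.3 and the function names are assumptions. The ReliefF extraction step is not shown.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant sequence has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0

def uncorrelated_features(features, threshold=0.3):
    """features: dict of name -> values. Return names of weakly correlated ones."""
    flagged = []
    for name, values in features.items():
        strongest = max(
            (abs(pearson(values, other))
             for other_name, other in features.items() if other_name != name),
            default=0.0,
        )
        if strongest < threshold:
            flagged.append(name)
    return flagged
```

The data behind the flagged features would then be removed before the remaining data of the channel is clustered.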

Referring to Fig. 8, in one embodiment, the client category analysis device further includes:

A medical data acquisition unit 501, configured to obtain the medical data of the new client;

A first selection unit 502, configured to select, according to the medical data, the insurance product information of the insurance products that the new client is suited to purchase;

A second recommendation unit 503, configured to filter out, from the insurance product information of the insurance products suited to purchase, the insurance product information corresponding to the client category of the new client and recommend it to the new client.

The actions performed by the medical data acquisition unit 501, the first selection unit 502 and the second recommendation unit 503 are mainly used in insurance sales scenarios. The medical data mainly includes the new client's social security card usage data at hospitals and electronic medical record data, from which the physical condition of the new client can be preliminarily determined. Insurance products suited to purchase are first screened out of the insurance product information library according to that physical condition; the insurance product information corresponding to the client category is then found among them and recommended to the new client. The screening by physical condition reflects the fact that certain insurance products cannot be purchased with certain physical conditions. For example, if the medical data shows that the new client has a certain disease and an insurance product happens to include cover for that disease, that product is not suitable for the new client, whereas insurance products without cover for that disease may be suitable. In the present application, the insurance product information recommended to the new client can be that of the insurance product sold most to the client category, or that of the insurance products whose sales rank for the client category (the higher the sales, the higher the rank) is above a specified rank, and so on.
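The two-stage selection described above, screening by physical condition and then ranking by the client category's sales, can be sketched as follows. The data layout (a mapping from product name to the set of conditions it covers, and per-category sales counts) and all names are hypothetical.

```python
def recommend_insurance(diagnosed, products, category_sales, client_category, top_n=2):
    """Screen out products covering a diagnosed condition, then rank by sales.

    diagnosed: set of conditions found in the medical data.
    products: dict of product name -> set of conditions the product covers.
    category_sales: dict of category -> {product name: units sold}.
    """
    # Stage 1: a product covering any diagnosed condition is not suitable.
    suitable = [name for name, covered in products.items()
                if not (covered & diagnosed)]
    # Stage 2: rank the suitable products by sales within the client category.
    sales = category_sales.get(client_category, {})
    ranked = sorted(suitable, key=lambda name: sales.get(name, 0), reverse=True)
    return ranked[:top_n]
```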

In one embodiment, the first selection unit 502 includes:

A second feature extraction module, configured to perform feature extraction on the medical data to obtain multiple medical feature data;

A second correlation analysis module, configured to extract, from the multiple medical feature data, the feature data that is uncorrelated with the other medical feature data, as uncorrelated medical feature data;

A second removal and clustering module, configured to remove the medical data corresponding to the uncorrelated medical feature data and to select, according to the retained medical data, the insurance product information of the insurance products that the new client is suited to purchase.

In the second feature extraction module, the second correlation analysis module and the second removal and clustering module, the medical data may contain data related to insurance fraud, and such data generally differs in some way from ordinary data. For example, when someone deliberately buys drugs with a social security card and resells them to other stores, the swipe frequency and swipe amounts of the card follow a pattern: the drugs bought each time differ but the amounts are similar, and a purchase is made at regular intervals. Such medical data has low correlation with the rest, so correlation analysis of the feature data can extract it as uncorrelated medical feature data. The medical data corresponding to the uncorrelated medical feature data is then removed, and the retained medical data is used to determine the insurance product information of the insurance products that the new client can purchase.

In other embodiments, a disease diagnosis unit can also be provided. It obtains the facial features of the new client and inputs them into preset disease prediction models for different diseases to determine whether the new client has the corresponding disease, and suitable insurance product information is then provided for the new client to choose from. Each disease prediction model is trained on the facial features of a large number of different people and the disease status corresponding to each set of facial features; after new facial features are input, the model outputs whether the person with those facial features has the disease.

Referring to Fig. 9, in another possible embodiment, the client category analysis device further includes:

A credit-reference data acquiring unit 511, configured to obtain the credit-reference data of the new client;

A second selection unit 512, configured to select, according to the credit-reference data, the credit product information of the credit products that the new client is suited to apply for;

A third recommendation unit 513, configured to filter out, from the credit product information of the credit products suited to application, the credit product information corresponding to the client category and recommend it to the new client.

The actions of the credit-reference data acquiring unit 511, the second selection unit 512 and the third recommendation unit 513 are mainly used in financial credit scenarios. The credit products include micro loans, mortgage loans, home purchase loans and the like. The credit-reference data refers to the creditworthiness of the new client in the banking system. For example, if the new client has repeatedly failed to repay a credit card on time, the credit value is low and a large mortgage loan may not be available; if the new client has used a credit card for a long time and repaid on time every month, the credit value is high and a large loan is possible; if the new client has never used a credit card or the like, the credit value is the initial value, and credit products with a moderate loan limit are considered. In the present application, the credit products that the new client can apply for are determined first, and the credit products corresponding to the client category are then selected from among them, which greatly improves the effect of the recommendation and helps the new client accurately select a credit product it can apply for.
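The credit flow can be sketched in the same two stages: determine which credit products the new client can apply for, then keep those that correspond to the client category. The application does not say how the credit value maps to eligibility, so the per-product minimum score used here is an assumption, as are the function and parameter names.

```python
def recommend_credit(credit_value, products, category_products, client_category):
    """Return the category's credit products that the client can apply for.

    products: dict of product name -> minimum credit value (assumed rule).
    category_products: dict of category -> ordered list of product names.
    """
    # Stage 1: products whose minimum credit value the client meets.
    eligible = {name for name, minimum in products.items()
                if credit_value >= minimum}
    # Stage 2: keep the category's products that are also eligible, in order.
    return [name for name in category_products.get(client_category, [])
            if name in eligible]
```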

The client category analysis device of the embodiment of the present application acquires and analyzes the data of the new client from multiple channels, which makes the assessment of each person more accurate. Assessing over multiple channels evaluates the new client more comprehensively and avoids assessment errors caused by falsified personal information in a single channel, so that the client category of the new client is obtained accurately and products suited to the new client are recommended, improving the efficiency of recommendation and sales.

Referring to Fig. 10, an embodiment of the present application also provides a computer device, which can be a server and whose internal structure can be as shown in Fig. 10. The computer device includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device stores the channel data obtained from each channel and the like. The network interface of the computer device communicates with an external terminal through a network connection. When executed by the processor, the computer program implements a client category analysis method.

The steps of the client category analysis method executed by the processor are: respectively obtaining data related to a new client from multiple channels; clustering the data obtained from each channel separately, obtaining multiple groups of cluster data in one-to-one correspondence with the multiple channels; forming the multiple groups of cluster data into a sparse matrix and filling in the sparse matrix by a collaborative filtering method, forming a first vector matrix corresponding to the new client; calculating the similarity between the first vector matrix and each of multiple second vector matrices in a preset client category database, wherein the client category database includes multiple client categories and second vector matrices in one-to-one correspondence with the client categories; and recording the client category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the client category of the new client.

In one embodiment, after the step of recording the client category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the client category of the new client, the method includes: searching for product information corresponding to the client category of the new client; and recommending the product information to the new client.

In one embodiment, the step of recommending the product information to the new client includes: presenting the product information in chart form and recommending it to the new client, wherein the chart form includes a text introduction of the products in the product information and sales data graphs of the products.

In one embodiment, the step of forming the multiple groups of cluster data into a sparse matrix and filling in the sparse matrix by a collaborative filtering method to form the first vector matrix corresponding to the new client includes:

Performing feature extraction separately on the data obtained from each channel, obtaining multiple feature data for each channel; extracting, from the multiple feature data of each channel, the feature data that is uncorrelated with the other feature data, as uncorrelated feature data; removing the data corresponding to the uncorrelated feature data of each channel and clustering the remaining data of each channel separately, obtaining multiple groups of cluster data in one-to-one correspondence with the multiple channels.

In one embodiment, after the step of recording the client category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the client category of the new client, the method includes: obtaining the medical data of the new client; selecting, according to the medical data, the insurance product information of the insurance products that the new client is suited to purchase; and filtering out, from the insurance product information of the insurance products suited to purchase, the insurance product information corresponding to the client category of the new client and recommending it to the new client.

In one embodiment, the step of selecting, according to the medical data, the insurance product information of the insurance products that the new client is suited to purchase includes: performing feature extraction on the medical data to obtain multiple medical feature data; extracting, from the multiple medical feature data, the feature data that is uncorrelated with the other medical feature data, as uncorrelated medical feature data; removing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the retained medical data, the insurance product information of the insurance products that the new client is suited to purchase.

In one embodiment, after the step of recording the client category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the client category of the new client, the method includes: obtaining the credit-reference data of the new client; selecting, according to the credit-reference data, the credit product information of the credit products that the new client is suited to apply for; and filtering out, from the credit product information of the credit products suited to application, the credit product information corresponding to the client category and recommending it to the new client.

It will be understood by those skilled in the art that the structure shown in Fig. 10 is only a block diagram of the part of the structure relevant to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied.

The computer device of the embodiment of the present application acquires and analyzes the data of the new client from multiple channels, which makes the assessment of each person more accurate. Assessing over multiple channels evaluates the new client more comprehensively and avoids assessment errors caused by falsified personal information in a single channel, so that the client category of the new client is obtained accurately and products suited to the new client are recommended, improving the efficiency of recommendation and sales.

An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements a client category analysis method, specifically: respectively obtaining data related to a new client from multiple channels; clustering the data obtained from each channel separately, obtaining multiple groups of cluster data in one-to-one correspondence with the multiple channels; forming the multiple groups of cluster data into a sparse matrix and filling in the sparse matrix by a collaborative filtering method, forming a first vector matrix corresponding to the new client; calculating the similarity between the first vector matrix and each of multiple second vector matrices in a preset client category database, wherein the client category database includes multiple client categories and second vector matrices in one-to-one correspondence with the client categories; and recording the client category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the client category of the new client.

The client category analysis method above acquires and analyzes the data of the new client from multiple channels, which makes the assessment of each person more accurate. Assessing over multiple channels evaluates the new client more comprehensively and avoids assessment errors caused by falsified personal information in a single channel, so that the client category of the new client is obtained accurately and products suited to the new client are recommended, improving the efficiency of recommendation and sales.

In one embodiment, after the processor performs the step of recording the client category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the client category of the new client, the method includes: searching for product information corresponding to the client category of the new client; and recommending the product information to the new client.

In one embodiment, the step in which the processor recommends the product information to the new client includes: presenting the product information in chart form and recommending it to the new client, wherein the chart form includes a text introduction of the products in the product information and sales data graphs of the products.

In one embodiment, the step in which the processor forms the multiple groups of cluster data into a sparse matrix and fills in the sparse matrix by a collaborative filtering method to form the first vector matrix corresponding to the new client includes: performing feature extraction separately on the data obtained from each channel, obtaining multiple feature data for each channel; extracting, from the multiple feature data of each channel, the feature data that is uncorrelated with the other feature data, as uncorrelated feature data; removing the data corresponding to the uncorrelated feature data of each channel and clustering the remaining data of each channel separately, obtaining multiple groups of cluster data in one-to-one correspondence with the multiple channels.

In one embodiment, after the processor performs the step of recording the client category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the client category of the new client, the method includes: obtaining the medical data of the new client; selecting, according to the medical data, the insurance product information of the insurance products that the new client is suited to purchase; and filtering out, from the insurance product information of the insurance products suited to purchase, the insurance product information corresponding to the client category of the new client and recommending it to the new client.

In one embodiment, the step in which the processor selects, according to the medical data, the insurance product information of the insurance products that the new client is suited to purchase includes: performing feature extraction on the medical data to obtain multiple medical feature data; extracting, from the multiple medical feature data, the feature data that is uncorrelated with the other medical feature data, as uncorrelated medical feature data; removing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the retained medical data, the insurance product information of the insurance products that the new client is suited to purchase.

In one embodiment, after the processor performs the step of recording the client category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the client category of the new client, the method includes: obtaining the credit-reference data of the new client; selecting, according to the credit-reference data, the credit product information of the credit products that the new client is suited to apply for; and filtering out, from the credit product information of the credit products suited to application, the credit product information corresponding to the client category and recommending it to the new client.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media provided in the present application and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).

The foregoing describes only preferred embodiments of the present application and does not limit the patent scope of the application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims

1. A customer category analysis method, characterized by comprising:

Obtaining data related to a new customer from each of multiple channels;

Clustering the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels;

Forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer;

Performing a similarity calculation between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories;

Recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.

2. The customer category analysis method according to claim 1, characterized in that, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method comprises:

Recommending the product information to the new customer.

3. The customer category analysis method according to claim 2, characterized in that the step of recommending the product information to the new customer comprises:

Presenting the product information in chart form and recommending it to the new customer, wherein the chart form includes a product feature introduction and a product sales data chart for the product information.

4. The customer category analysis method according to claim 1, characterized in that the step of forming the multiple groups of cluster data into a sparse matrix and completing the sparse matrix by a collaborative filtering method to form the first vector matrix corresponding to the new customer comprises:

Performing feature extraction separately on the data obtained from each channel to obtain multiple pieces of feature data corresponding to each channel;

Removing the data corresponding to the uncorrelated feature data of each channel, and clustering the remaining data of each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels.

5. The customer category analysis method according to claim 1, characterized in that, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method comprises:

Obtaining medical data of the new customer;

Selecting, according to the medical data, the insurance product information of insurance products suitable for the new customer to purchase;

Filtering out, from the insurance product information of the insurance products suitable for purchase, the insurance product information corresponding to the customer category of the new customer, and recommending it to the new customer.

6. The customer category analysis method according to claim 5, characterized in that the step of selecting, according to the medical data, the insurance product information of insurance products suitable for the new customer to purchase comprises:

Performing feature extraction on the medical data to obtain multiple pieces of medical feature data;

Extracting, from the multiple pieces of medical feature data, the feature data that is uncorrelated with the other medical feature data as uncorrelated medical feature data;

Removing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the remaining medical data, the insurance product information of insurance products suitable for the new customer to purchase.

7. The customer category analysis method according to claim 1, characterized in that, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method comprises:

Obtaining credit reference data of the new customer;

Selecting, according to the credit reference data, the credit product information of credit products suitable for the new customer to apply for;

Filtering out, from the credit product information of the suitable credit products, the credit product information corresponding to the customer category, and recommending it to the new customer.

8. A customer category analysis device, characterized by comprising:

A clustering unit, configured to cluster the data obtained from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels;

A vectorization unit, configured to form the multiple groups of cluster data into a sparse matrix, and to complete the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer;

A computing unit, configured to perform a similarity calculation between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories;

A selecting unit, configured to record the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.

9. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.

10. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.