WO2019223082A1 - Customer category analysis method, apparatus, computer device, and storage medium - Google Patents

Customer category analysis method, apparatus, computer device, and storage medium

Info

Publication number
WO2019223082A1
WO2019223082A1 PCT/CN2018/095482 CN2018095482W WO2019223082A1 WO 2019223082 A1 WO2019223082 A1 WO 2019223082A1 CN 2018095482 W CN2018095482 W CN 2018095482W WO 2019223082 A1 WO2019223082 A1 WO 2019223082A1
Authority
WO
WIPO (PCT)
Prior art keywords
customer
data
new
product information
vector matrix
Prior art date
Application number
PCT/CN2018/095482
Other languages
English (en)
French (fr)
Inventor
金戈
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019223082A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Definitions

  • the present application relates to the field of computers, and in particular, to a customer category analysis method, apparatus, computer equipment, and storage medium.
  • customers are classified through a single channel: information about the customer is obtained from one channel, data analysis is performed, and the customer is finally assigned to the corresponding customer category. Because the data comes from a single channel, the accuracy of the customer category analysis result is greatly affected if that channel holds little data or contains fraudulent data.
  • the main purpose of this application is to provide a customer category analysis method, device, computer equipment, and storage medium, and to improve the accuracy of the customer category analysis results.
  • this application first proposes a customer category analysis method, including:
  • the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix is recorded as the customer category of the new customer.
  • This application also provides a customer category analysis device, including:
  • An acquisition unit for acquiring data related to a new customer on multiple channels, respectively;
  • a clustering unit configured to perform clustering processing on the data obtained by each channel separately to obtain multiple sets of clustering data corresponding to the multiple channels one to one;
  • a vectorization unit configured to form multiple sets of clustering data into a sparse matrix, and supplement the sparse matrices through a collaborative filtering method to form a first vector matrix corresponding to the new customer;
  • a calculation unit configured to calculate the similarity between the first vector matrix and each of a plurality of second vector matrices in a preset customer category database, wherein the customer category database includes a plurality of customer categories and a second vector matrix corresponding to each customer category;
  • a selection unit is configured to record the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.
  • the present application further provides a computer device including a memory and a processor, where the memory stores computer-readable instructions, and when the processor executes the computer-readable instructions, implements the steps of any of the foregoing methods.
  • the present application also provides a non-volatile computer-readable storage medium having computer-readable instructions stored thereon, which, when executed by a processor, implement the steps of the method according to any one of the foregoing.
  • the customer category analysis method, device, computer equipment, and storage medium of this application obtain data on a new customer from multiple channels for analysis, so that each person is evaluated more accurately. Evaluating across multiple channels assesses the new customer more comprehensively and avoids the bias caused by fraudulent personal information in a single channel, so the customer category of the new customer is derived accurately and products suitable for the new customer can then be recommended, improving the efficiency of resale recommendations.
  • FIG. 1 is a schematic flowchart of a customer category analysis method according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a customer category analysis method according to an embodiment of the present application
  • FIG. 3 is a detailed flowchart of step S2 of the above-mentioned customer category analysis method according to an embodiment of the present application;
  • FIG. 4 is a schematic block diagram of a structure of a customer category analysis device according to an embodiment of the present application.
  • FIG. 5 is a schematic block diagram of a structure of a customer category analysis device according to an embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a structure of a first recommendation unit according to an embodiment of the present application.
  • FIG. 7 is a schematic block diagram of a cluster unit according to an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of a structure of a customer category analysis device according to an embodiment of the present application.
  • FIG. 9 is a schematic block diagram of a structure of a customer category analysis device according to an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • an embodiment of the present application proposes a customer category analysis method, including steps:
  • the multiple channels refer to two or more channels, and the channels refer to channels for obtaining data, such as a game channel, an online interaction channel, a consumption channel, and a social communication channel.
  • the main ways to obtain data for each channel include: purchase data, crawling data through crawler technology, and so on.
  • four channels are selected, specifically a game channel, an online interactive channel (WeChat), a consumption channel (Alipay), and a social communication channel (Weibo).
  • the data of the game, WeChat, and Alipay channels can be obtained with the authorization of the parties concerned.
  • Weibo data can be obtained through crawler technology.
  • Weibo data can also be purchased.
  • the above game channels generally use WeChat games as the data of the game channels.
  • other games such as NetEase games and Shanda games can also be used as the data of the game channels.
  • the above game channel data mainly includes game consumption data, game time data, etc.;
  • the above WeChat channel data mainly includes friends circle data (including the number of friends circle posts, the number of friends in the circle, the number of long-term interactions, the content the customer has published in the friends circle, the content published by others in the friends circle, etc.);
  • the data of the above Alipay channel mainly includes consumption record data, consumption venue data, consumption type data, etc.;
  • the data of the Weibo channel mainly includes the content of Weibo posts, follow records, the content posted by followed accounts, etc.
  • more channel data may be obtained, such as transportation channel data (choice of vehicle, travel frequency, etc.) and dining channel data (dining consumption, dining type, dining time, etc.).
  • the above-mentioned clustering process is to perform clustering on the data of each channel separately.
  • the clustering algorithm used is the K-means clustering algorithm: initialize the constant K and randomly select initial points as the centroids; assign each data point to its nearest centroid; recalculate the centroids; repeat the previous two steps until the centroids no longer change. Because K-means is an existing clustering algorithm, the specific clustering process is not repeated here.
  • This application uses a K-means clustering algorithm, which is fast and simple; it has high efficiency and scalability for large data sets, the time complexity is nearly linear, and it is suitable for mining large-scale data sets.
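  • The K-means loop outlined above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the applicant's implementation; the function name `kmeans`, the `max_iter` cap, and the fixed `seed` are assumptions introduced for the example.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length numeric tuples) into k groups."""
    rng = random.Random(seed)
    # Randomly select k initial points as the centroids.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Assign each data point to its nearest centroid (squared Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the centroids no longer change
        assignments = new_assignments
        # Recalculate each centroid as the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments
```

  • For example, `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], 2)` separates the two points near the origin from the two points near (10, 10).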
  • the sparse matrix refers to that the number of non-zero elements in the matrix is much smaller than the total number of matrix elements, and the distribution of non-zero elements is irregular.
  • the clustering processing of the multiple groups of data described above reduces the amount of data obtained. Because the types and sources of the data differ, when the clustering results of each group are substituted into a preset matrix, the non-zero elements are distributed irregularly and are only weakly correlated, so a sparse matrix is formed.
  • the above data includes four groups: one group of clustering results is used as the first row of the sparse matrix, and the other three groups of clustering results are used as the second, third, and fourth rows.
  • the above collaborative filtering method zero-fills the gaps between the non-zero elements in the sparse matrix. That is, because the clustering results of each group of data differ, zeros must be added as padding between the non-zero elements so that the data line up, which yields the above-mentioned first vector matrix.
  • the first vector matrix contains the data characteristics of the above-mentioned multiple channels. Furthermore, in the subsequent use process, the overall judgment will not be affected because the data of a certain channel has been tampered with.
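  • As a rough illustration of the zero filling described above, the sketch below stacks one row of clustering results per channel and pads shorter rows with zeros. It follows the zero-fill description in this text rather than conventional collaborative filtering, and it assumes the padding is appended at the end of each row; the function name `build_first_matrix` is hypothetical.

```python
def build_first_matrix(channel_rows):
    """Stack per-channel clustering results into one zero-padded matrix.

    channel_rows: one list of numbers per channel (lengths may differ).
    Assumption: padding zeros are appended at the end of each shorter row.
    """
    width = max(len(row) for row in channel_rows)
    # Pad shorter rows with zeros so every channel occupies a full row.
    return [list(row) + [0.0] * (width - len(row)) for row in channel_rows]
```

  • With four channels, the result has four rows of equal length, so the same position in every customer's matrix refers to the same slot.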
  • the second vector matrix is a vector matrix prepared in advance from historical customers. Because the categories of historical customers have already been confirmed, the data obtained for each historical customer through the same channels can be passed through steps S1-S3 to generate a first vector matrix as described above, except that the corresponding customer category is known. Multiple first vector matrices are thus obtained for the historical customers of each customer category, and the first vector matrices belonging to the same customer category are then averaged to obtain the second vector matrix corresponding to that customer category.
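  • The averaging step can be sketched as an element-wise mean per customer category. The sketch assumes every historical first vector matrix has the same shape and is represented as a list of rows; the name `build_second_matrices` and the `(category, matrix)` input format are illustrative assumptions.

```python
from collections import defaultdict

def build_second_matrices(history):
    """history: list of (category, first_matrix) pairs with equal matrix shapes."""
    grouped = defaultdict(list)
    for category, matrix in history:
        grouped[category].append(matrix)
    second = {}
    for category, matrices in grouped.items():
        n = len(matrices)
        # Element-wise mean over all first vector matrices of this category.
        second[category] = [
            [sum(m[i][j] for m in matrices) / n for j in range(len(matrices[0][i]))]
            for i in range(len(matrices[0]))
        ]
    return second
```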
  • the customer categories of the above historical customers are assigned using learning vector quantization (LVQ), and the specific process is as follows:
  • the learning goal of LVQ is to obtain k vectors q_1, q_2, ..., q_k, where each q_i represents one learning goal;
  • each vector corresponds to a region, and the sample points in the region belong to the class of the vector, and then the customer category is obtained.
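  • The text does not spell out the LVQ update rule, so the following sketch uses the standard LVQ1 rule as an assumption: the prototype nearest to a sample is pulled toward it when their labels match and pushed away otherwise. The names `train_lvq` and `classify`, the learning rate `lr`, and the epoch count are hypothetical.

```python
def _nearest(x, protos):
    # Index of the prototype closest to x (squared Euclidean distance).
    return min(range(len(protos)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, protos[i])))

def train_lvq(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20):
    """LVQ1 sketch: returns the learned prototype vectors q_1 ... q_k."""
    protos = [list(q) for q in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            i = _nearest(x, protos)
            # Move toward the sample if the class matches, away from it otherwise.
            sign = 1.0 if proto_labels[i] == y else -1.0
            protos[i] = [q + sign * lr * (a - q) for a, q in zip(x, protos[i])]
    return protos

def classify(x, protos, proto_labels):
    """A sample belongs to the class of the prototype whose region it falls in."""
    return proto_labels[_nearest(x, protos)]
```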
  • the first vector matrix is the vector matrix of the customer to be classified.
  • the similarity can be calculated using a measure such as Euclidean distance, Manhattan distance, Minkowski distance, or cosine similarity.
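  • As one possible reading of the similarity step, the sketch below flattens each matrix into a vector, computes cosine similarity, and picks the customer category whose second vector matrix scores highest. Cosine similarity is only one of the measures named above, and the function names and the dictionary layout of the category database are assumptions.

```python
import math

def cosine_similarity(m1, m2):
    """Cosine similarity between two equally shaped matrices (lists of rows)."""
    a = [v for row in m1 for v in row]
    b = [v for row in m2 for v in row]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar_category(first_matrix, category_db):
    """category_db maps each customer category to its second vector matrix."""
    return max(category_db, key=lambda c: cosine_similarity(first_matrix, category_db[c]))
```

  • If a distance such as Euclidean or Manhattan distance is used instead, the category with the smallest distance would be selected rather than the largest score.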
  • in step S5, the customer category of the new customer is recorded, which facilitates subsequently recommending products to the new customer and the like.
  • after step S5 of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
  • the product information corresponding to a customer category is the product information of the products that customers of that category purchase most. Because data records show which types of products are purchased in larger numbers by which customer category, the product information corresponding to the new customer is easy to obtain.
  • a method for finding the product information corresponding to the customer category of a new customer specifically includes: (1) searching a preset database for all product records purchased by customers of the customer category corresponding to the new customer; (2) finding, in those product records, the product information that meets the requirement, where meeting the requirement means that the product ranks above a specified rank when products are sorted from most purchased to least purchased; (3) recommending the above-mentioned product information to the new customer.
  • the recommended methods include email, WeChat, and SMS.
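  • The lookup in steps (1) and (2) above can be sketched as a count-and-rank over purchase records. The record format (pairs of customer category and product name), the function name `top_products`, and the `top_n` cutoff standing in for the "specified rank" are assumptions made for illustration.

```python
from collections import Counter

def top_products(purchase_records, category, top_n=3):
    """purchase_records: iterable of (customer_category, product_name) pairs."""
    # Step (1): keep only the purchases made by customers of this category.
    counts = Counter(product for cat, product in purchase_records if cat == category)
    # Step (2): rank from most to least purchased and keep the leading entries.
    return [product for product, _ in counts.most_common(top_n)]
```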
  • the above-mentioned step S7 of recommending the product information to the new customer includes:
  • the sales data chart of the product may be any chart that can represent data, such as a bar chart, a line chart, or an area chart.
  • the product information recommended to the new customer may cover multiple products whose sales data differ. Once the sales data is visualized, the new customer can see at a glance which product has the highest sales volume, which improves the efficiency with which the new customer reviews the recommended content.
  • the above-mentioned step S2 of performing clustering processing on the data obtained by each channel separately to obtain multiple sets of clustering data corresponding to the multiple channels one to one includes:
  • S21 Perform feature extraction on the data obtained through each channel, and obtain multiple feature data corresponding to each channel;
  • the data corresponding to the uncorrelated feature data of each channel is cleared, and the remaining data of each channel is clustered separately to obtain multiple groups of clustering data corresponding to the multiple channels one to one.
  • feature extraction is performed separately on the data obtained through each channel to obtain multiple sets of feature data, one set per channel, each set containing multiple feature data items. Correlation analysis is then performed on each set to find the feature data that is not correlated with the other feature data in that set, and this feature data is recorded as uncorrelated feature data. Because it is not correlated with the other feature data, the data corresponding to it may be problematic, so this potentially problematic data is cleared in advance to improve the accuracy of the subsequent clustering results.
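  • One way to picture the correlation analysis is to flag a feature whose absolute Pearson correlation with every other feature in the same channel stays below a threshold. The text does not name a correlation measure or a threshold, so Pearson correlation, the value 0.3, and the function names here are assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def uncorrelated_features(features, threshold=0.3):
    """features maps a feature name to its list of values (equal lengths).

    Returns the names of features that are weakly correlated with all others.
    """
    flagged = []
    for name, values in features.items():
        others = [abs(pearson(values, v)) for n, v in features.items() if n != name]
        if others and max(others) < threshold:
            flagged.append(name)
    return flagged
```

  • The data behind the flagged features would then be cleared before clustering the remaining data of that channel.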
  • the feature extraction of the data can use the ReliefF algorithm.
  • the ReliefF algorithm was proposed by Kononenko in 1994 as an extension of the Relief algorithm (Relief is a feature weighting algorithm).
  • each feature is given a weight according to its correlation with the category, and features whose weight is below a certain threshold are removed. Compared with the Relief algorithm, ReliefF can handle multi-class problems. Because ReliefF is a known algorithm, the process of extracting features from the data is not described again here.
  • after step S5 of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
  • the above medical data mainly includes the new customer's social security card usage data at hospitals and electronic medical record data.
  • the medical data can be used to determine the physical condition of the new customer. According to that physical condition, the insurance products suitable for purchase are first screened out of the insurance product information database; the insurance product information corresponding to the customer category is then found among the insurance product information suitable for purchase, and the insurance product information found is recommended to the new customer.
  • the insurance product information database is screened for insurance products suitable for purchase according to physical condition because, depending on the physical condition, certain insurance products cannot be purchased.
  • the insurance product information recommended to the new customer can be that of the insurance product with the highest sales in this customer category, or that of the insurance products ranked above a designated rank when sorted by sales volume in this customer category (the higher the sales, the higher the rank).
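  • The two-stage selection described above (first screen by physical condition, then rank within the customer category) can be sketched as below. How eligibility is derived from medical data is not specified in the text, so the list of eligible products is taken as an input; the function name, the sales dictionary, and the `top_n` cutoff are hypothetical.

```python
def recommend_insurance(eligible_products, category_sales, top_n=2):
    """Pick the best-selling eligible insurance products for a customer category.

    eligible_products: products the customer may buy given the medical data.
    category_sales: product name -> units sold to this customer category.
    """
    # Keep only eligible products that have sales records in this category.
    candidates = [p for p in eligible_products if p in category_sales]
    # Higher sales rank first.
    candidates.sort(key=lambda p: category_sales[p], reverse=True)
    return candidates[:top_n]
```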
  • step S502 of selecting insurance product information of an insurance product suitable for purchase by the new customer according to the medical data includes:
  • S5021 Perform feature extraction on the medical data to obtain multiple medical characteristic data
  • S5023 Clear the medical data corresponding to the irrelevant medical characteristic data, and select the insurance product information of the insurance products suitable for the new customer to purchase according to the remaining medical data.
  • in steps S5021, S5022, and S5023 above, the medical data may contain data related to insurance fraud, and such data generally differs from ordinary data. For example, when someone intentionally uses a social security card to buy drugs and then resells the drugs to other people or to shops, the swipe frequency and swipe amounts of the social security card follow certain patterns; for instance, the medicines purchased differ each time but the amount is the same.
  • the correlation of such medical data is low, so when correlation analysis is performed on the characteristic data, it can be extracted as irrelevant medical characteristic data. The medical data corresponding to the irrelevant medical characteristic data is then cleared, and the retained medical data is used to determine the insurance product information of the insurance products that the new customer can purchase.
  • it is also possible to obtain the facial features of the new customer and input them into different preset disease prediction models. Each disease prediction model is trained on a large number of different face features together with whether each face corresponds to a given disease; when new face features are input, the model outputs whether they indicate that disease. The results are used to judge whether the new customer has the corresponding disease, and suitable insurance product information is then provided for the new customer to choose from.
  • after step S5 of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
  • the credit product information corresponding to the customer category is selected from the credit product information of the credit products suitable for purchase and recommended to the new customer.
  • the above credit products include small loans, mortgage loans, home purchase loans, and the like; the above credit reference data refers to the creditworthiness of the new customer in the banking system. For example, if a new customer has repeatedly failed to repay a credit card on time, the customer's creditworthiness is low and a large mortgage may not be possible. If a new customer has used a credit card for a long time and repaid it on time every month, the customer's creditworthiness is high and a large loan can be made. If a new customer has not used a credit card or similar product, so that the credit value is still the initial value, a credit product with a moderate loan amount can be considered. In this application, the credit products that the new customer can apply for are determined first, and the credit products corresponding to the customer category are then selected from among them, which greatly improves the recommendation effect and helps the new customer accurately choose a credit product that can actually be applied for.
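  • The credit screening described above can be pictured with a simple tiering rule. The text gives only qualitative guidance, so the numeric credit score, the three loan tiers, and the function name `eligible_credit_products` are all hypothetical choices made for this sketch.

```python
def eligible_credit_products(credit_score, initial_score, products):
    """Return the names of credit products the customer may apply for.

    products: list of (name, tier) pairs, tier in {"small", "moderate", "large"}.
    Assumption: a score above the initial value means good repayment history,
    a score below it means repeated late repayment.
    """
    if credit_score > initial_score:
        allowed = {"small", "moderate", "large"}
    elif credit_score == initial_score:
        allowed = {"small", "moderate"}  # no credit history yet
    else:
        allowed = {"small"}  # low creditworthiness rules out larger loans
    return [name for name, tier in products if tier in allowed]
```

  • The resulting list would then be narrowed to the credit products associated with the new customer's category before recommendation.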
  • the customer category analysis method in the embodiment of the present application acquires data on a new customer from multiple channels for analysis, so that each person is evaluated more accurately. Evaluating across multiple channels assesses the new customer more comprehensively and avoids the evaluation bias that arises when the personal information in a single channel is false, so the customer category of the new customer is obtained accurately and products suitable for the new customer can then be recommended, improving the efficiency of resale recommendations.
  • an embodiment of the present application provides a customer category analysis device, including:
  • An obtaining unit 10 configured to obtain data related to a new customer on multiple channels
  • the clustering unit 20 is configured to perform clustering processing on the data obtained by each channel separately to obtain multiple groups of clustering data corresponding to the multiple channels one to one;
  • a vectorization unit 30 configured to form multiple sets of clustering data into a sparse matrix, and complete the sparse matrix through a collaborative filtering method to form a first vector matrix corresponding to the new customer;
  • a computing unit 40 configured to calculate the similarity between the first vector matrix and each of a plurality of second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and a second vector matrix corresponding to each customer category one to one;
  • the selecting unit 50 is configured to record the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.
  • the multiple channels refer to two or more channels, and the channels refer to channels for acquiring data, such as a game channel, an online interaction channel, a consumption channel, and a social communication channel.
  • the main ways to obtain data for each channel include: purchase data, crawling data through crawler technology, and so on.
  • four channels are selected, specifically a game channel, an online interactive channel (WeChat), a consumption channel (Alipay), and a social communication channel (Weibo).
  • the data of the game, WeChat, and Alipay channels can be obtained with the authorization of the parties concerned.
  • Weibo data can be obtained through crawler technology.
  • Weibo data can also be purchased.
  • the above game channels generally use WeChat games as the data of the game channels.
  • other games such as NetEase games and Shanda games can also be used as the data of the game channels.
  • the above game channel data mainly includes game consumption data, game time data, etc.;
  • the above WeChat channel data mainly includes friends circle data (including the number of friends circle posts, the number of friends in the circle, the number of long-term interactions, the content the customer has published in the friends circle, the content published by others in the friends circle, etc.);
  • the data of the above Alipay channel mainly includes consumption record data, consumption venue data, consumption type data, etc.;
  • the data of the Weibo channel mainly includes the content of Weibo posts, follow records, the content posted by followed accounts, etc.
  • more channel data may be obtained, such as transportation channel data (choice of vehicle, travel frequency, etc.) and dining channel data (dining consumption, dining type, dining time, etc.).
  • the above-mentioned clustering processing clusters the data of each channel separately, and the clustering algorithm used is the K-means clustering algorithm: initialize the constant K and randomly select initial points as the centroids; assign each data point to its nearest centroid; recalculate the centroids; repeat the previous two steps until the centroids no longer change.
  • because K-means is an existing clustering algorithm, the specific clustering process is not repeated here.
  • This application uses a K-means clustering algorithm, which is fast and simple; it has high efficiency and scalability for large data sets, the time complexity is nearly linear, and it is suitable for mining large-scale data sets.
  • the sparse matrix refers to that the number of non-zero elements in the matrix is much smaller than the total number of matrix elements, and the distribution of non-zero elements is irregular.
  • the clustering processing of the multiple groups of data described above reduces the amount of data obtained. Because the types and sources of the data differ, when the clustering results of each group are substituted into a preset matrix, the non-zero elements are distributed irregularly and are only weakly correlated, so a sparse matrix is formed.
  • the above data includes four groups: one group of clustering results is used as the first row of the sparse matrix, and the other three groups of clustering results are used as the second, third, and fourth rows.
  • the above collaborative filtering method zero-fills the gaps between the non-zero elements in the sparse matrix. That is, because the clustering results of each group of data differ, zeros must be added as padding between the non-zero elements so that the data line up, which yields the above-mentioned first vector matrix.
  • the first vector matrix contains the data characteristics of the above-mentioned multiple channels. Furthermore, in the subsequent use process, the overall judgment will not be affected because the data of a certain channel has been tampered with.
  • the second vector matrix is a vector matrix prepared in advance from historical customers. Because the categories of historical customers have already been confirmed, the data obtained for each historical customer through the same channels can be processed by the above-mentioned obtaining unit 10, clustering unit 20, and vectorization unit 30 to generate a first vector matrix as described above, except that the corresponding customer category is known. Multiple first vector matrices are thus obtained for the historical customers of each customer category, and the first vector matrices belonging to the same customer category are then averaged to obtain the second vector matrix corresponding to that customer category. The customer categories of the above historical customers are assigned using learning vector quantization (LVQ), and the specific process is as follows:
  • the learning goal of LVQ is to obtain k vectors q_1, q_2, ..., q_k, where each q_i represents one learning goal;
  • each vector corresponds to a region, and the sample points in the region belong to the class of the vector, and then the customer category is obtained.
  • the first vector matrix is the vector matrix of the customer to be classified.
  • the similarity can be calculated using a measure such as Euclidean distance, Manhattan distance, Minkowski distance, or cosine similarity.
  • the customer category of the new customer is marked, so as to facilitate the subsequent recommendation of products to the new customer and the like.
  • the above-mentioned customer category analysis device further includes:
  • a searching unit 60 configured to search for product information corresponding to a customer category of the new customer
  • the first recommendation unit 70 is configured to recommend the product information to the new customer.
  • the above-mentioned search unit 60 and the first recommendation unit 70 are devices that perform recommendation of product information to new customers according to the type of the new customers, so as to improve the efficiency of recommendation resale.
  • the product information corresponding to a customer category is the product information of the products that customers of that category purchase most. Because data records show which types of products are purchased in larger numbers by which customer category, the product information corresponding to the new customer is easy to obtain.
  • a method for finding the product information corresponding to the customer category of a new customer specifically includes: (1) searching a preset database for all product records purchased by customers of the customer category corresponding to the new customer; (2) finding, in those product records, the product information that meets the requirement, where meeting the requirement means that the product ranks above a specified rank when products are sorted from most purchased to least purchased; (3) recommending the above-mentioned product information to the new customer.
  • the recommended methods include email, WeChat, and SMS.
  • the first recommendation unit 70 includes:
  • the chart recommendation module 71 is configured to recommend the product information to the new customer in a chart form, wherein the chart form includes a text introduction of the product with product information and a sales data chart of the product.
  • the sales data chart of the product may be any chart that can represent data, such as a bar chart, a line chart, or an area chart.
  • the product information recommended to the new customer may cover multiple products whose sales data differ. Once the sales data is visualized, the new customer can see at a glance which product has the highest sales volume, which improves the efficiency with which the new customer reviews the recommended content.
  • the clustering unit 20 includes:
  • a first feature extraction module 21 is configured to perform feature extraction on data obtained through each channel, and obtain multiple feature data corresponding to each channel;
  • a first correlation analysis module 22 configured to extract feature data that is not related to other feature data from a plurality of feature data corresponding to each channel, and use the feature data as uncorrelated feature data;
  • the first clearing clustering module 23 is configured to clear the data corresponding to the uncorrelated feature data of each channel, and to perform clustering processing on the remaining data of each channel to obtain multiple groups of clustering data corresponding to the multiple channels one to one.
  • in the above first feature extraction module 21, first correlation analysis module 22, and first clearing clustering module 23, feature extraction is performed separately on the data obtained through each channel to obtain multiple sets of feature data, one set per channel, each set containing multiple feature data items. Correlation analysis is then performed on each set to find the feature data that is not correlated with the other feature data in that set, and this feature data is recorded as uncorrelated feature data. Because it is not correlated with the other feature data, the data corresponding to it may be problematic, so this potentially problematic data is cleared in advance to improve the accuracy of the subsequent clustering results.
  • the feature extraction of the data can use the ReliefF algorithm.
  • the ReliefF algorithm is the Kononeill algorithm in 1994 (Relief algorithm is a feature weighting algorithm).
  • the features are given different weights according to the correlation of each feature and category. (Features with weights less than a certain threshold will be removed). Compared with the Relief algorithm, the algorithm can be used to improve multi-class problems. Because the ReliefF algorithm is a known algorithm, it will not be described again. The process of data feature extraction.
  • the above-mentioned customer category analysis device further includes:
  • a medical data acquisition unit 501, configured to acquire medical data of the new customer;
  • a first selecting unit 502, configured to select, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase;
  • a second recommendation unit 503, configured to filter out, from the insurance product information of the suitable insurance products, the insurance product information corresponding to the customer category of the new customer and recommend it to the new customer.
  • The actions performed by the medical data acquisition unit 501, the first selecting unit 502, and the second recommendation unit 503 are mainly used in insurance sales scenarios.
  • The medical data mainly includes the new customer's social security card usage data at hospitals and electronic medical record data, from which the new customer's physical condition can be preliminarily determined. According to that condition, the insurance products suitable for purchase are first screened from the insurance product information database; the insurance product information corresponding to the customer category is then found among the suitable products and recommended to the new customer.
  • Screening the insurance product information database according to physical condition means that, depending on the customer's physical condition, certain insurance products cannot be purchased.
  • The insurance product information recommended to the new customer may be that of the best-selling insurance product for the customer category, or that of the insurance products ranked ahead of a specified rank by sales volume for the category (the higher the sales volume, the higher the rank).
  • the first selecting unit 502 includes:
  • a second feature extraction module, configured to perform feature extraction on the medical data to obtain multiple medical feature data;
  • a second correlation analysis module, configured to extract, from the multiple medical feature data, the feature data that is not correlated with the other medical feature data, as uncorrelated medical feature data;
  • a second clearing and clustering module, configured to clear the medical data corresponding to the uncorrelated medical feature data and to select, according to the retained medical data, insurance product information of insurance products suitable for the new customer to purchase.
  • In the second feature extraction module, the second correlation analysis module, and the second clearing and clustering module, the medical data may contain data related to insurance fraud.
  • Such data generally differs from ordinary data. For example, when someone deliberately uses a social security card to buy medicine and then resells the medicine to other stores, the card's swipe frequency and swipe amounts follow a pattern: the medicines bought differ each time but the amounts are similar, and purchases occur at regular intervals.
  • These medical data have low correlation with the rest, so correlation analysis of the feature data can extract them as uncorrelated medical feature data. The corresponding medical data is then cleared, and the retained medical data is used to determine the insurance product information of insurance products the new customer can purchase.
  • A disease judging unit may also be provided. It obtains the facial features of the new customer and inputs them into different preset disease prediction models to determine whether the new customer has the corresponding disease, and then provides suitable insurance product information for the new customer to choose from.
  • Each disease prediction model is trained on the facial features of a large number of different people together with the same disease associated with those facial features. Given new facial features, the model outputs whether the person with those features has the disease.
  • the above-mentioned customer category analysis device further includes:
  • a credit information acquisition unit 511, configured to acquire credit information of the new customer;
  • a second selecting unit 512, configured to select, according to the credit information, credit product information of credit products suitable for the new customer to apply for;
  • a third recommendation unit 513, configured to filter out, from the credit product information of the suitable credit products, the credit product information corresponding to the customer category and recommend it to the new customer.
  • The actions specified for the credit information acquisition unit 511, the second selecting unit 512, and the third recommendation unit 513 are mainly used in financial loan scenarios.
  • The credit products include micro loans, mortgage loans, home purchase loans, and so on. The credit information reflects the new customer's creditworthiness in the banking system. For example, a new customer who has repeatedly failed to repay a credit card on time has low creditworthiness and may be unable to take out a large mortgage loan; a new customer who has used a credit card for a long time and repaid on time every month has high creditworthiness and can take out a large loan; a new customer who has never used a credit card has the initial creditworthiness value, and credit products with a moderate loan amount are considered. In this application, the credit products the new customer can apply for are determined first, and the credit products corresponding to the customer category are then selected from among them. This improves the recommendation effect and helps the new customer accurately choose a credit product they can apply for.
  • The customer category analysis device in the embodiment of the present application acquires and analyzes data from multiple channels of a new customer, so that each person is evaluated more accurately. Multi-channel evaluation assesses the new customer more comprehensively and avoids the evaluation bias caused by falsified personal information in a single channel.
  • The customer category of the new customer is therefore obtained accurately, and products suitable for the new customer are recommended, improving the efficiency of converting recommendations into sales.
  • the computer device may be a server, and its internal structure may be as shown in FIG. 10.
  • the computer device includes a processor, a memory, a network interface, and a database connected through a system bus.
  • The processor of the computer device provides computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium.
  • the database of the computer equipment is used to store channel data obtained by each channel.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by a processor to implement a customer category analysis method.
  • The steps of the customer category analysis method executed by the processor are: acquiring data related to a new customer on each of multiple channels; clustering the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels; forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer; calculating the similarity between the first vector matrix and each of multiple second vector matrices in a preset customer category database;
  • wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories; and recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.
  • After the step of recording the customer category of the new customer, the method includes: finding product information corresponding to the customer category of the new customer; and recommending the product information to the new customer.
  • The step of recommending the product information to the new customer includes: recommending the product information to the new customer in chart form, wherein the chart form includes a text introduction of the product described by the product information and a sales data chart of the product.
  • The step of forming multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer, includes:
  • performing feature extraction on the data obtained through each channel to obtain multiple feature data corresponding to each channel; extracting, from the multiple feature data corresponding to each channel, the feature data that is not correlated with the other feature data, as uncorrelated feature data; clearing the data corresponding to the uncorrelated feature data of each channel, and clustering the remaining data of each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels.
  • After the step of recording the customer category of the new customer, the method includes: acquiring medical data of the new customer; selecting, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase; and filtering out, from the insurance product information of the suitable insurance products, the insurance product information corresponding to the customer category of the new customer and recommending it to the new customer.
  • The step of selecting, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase includes: performing feature extraction on the medical data to obtain multiple medical feature data; extracting, from the multiple medical feature data, the feature data that is not correlated with the other medical feature data, as uncorrelated medical feature data; clearing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the retained medical data, insurance product information of insurance products suitable for the new customer to purchase.
  • After the step of recording the customer category of the new customer, the method includes: acquiring credit information of the new customer; selecting, according to the credit information, credit product information of credit products suitable for the new customer to apply for; and filtering out, from the credit product information of the suitable credit products, the credit product information corresponding to the customer category and recommending it to the new customer.
  • FIG. 10 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
  • The computer device in the embodiment of the present application acquires and analyzes data from multiple channels of a new customer, so that each person is evaluated more accurately. Multi-channel evaluation assesses the new customer more comprehensively and avoids the evaluation bias caused by falsified personal information in a single channel, so the customer category of the new customer is obtained accurately and products suitable for the new customer are recommended, improving the efficiency of converting recommendations into sales.
  • An embodiment of the present application further provides a non-volatile computer-readable storage medium that stores computer-readable instructions.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

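A customer category analysis method and apparatus, a computer device, and a storage medium. Data from multiple channels of a new customer is acquired and analyzed, so that each person is evaluated more accurately. Multi-channel evaluation assesses the new customer more comprehensively and avoids the evaluation bias caused by falsified personal information in a single channel, so the customer category of the new customer is obtained accurately and products suitable for the new customer are recommended, improving the efficiency of converting recommendations into sales.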

Description

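Customer category analysis method and apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. 2018105457933, filed with the Chinese Patent Office on May 25, 2018 and entitled "Customer category analysis method and apparatus, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computers, and in particular to a customer category analysis method and apparatus, a computer device, and a storage medium.
Background
In insurance and investment or wealth management, dedicated systems perform statistics and calculations, such as classifying customers. The industry currently classifies customers through a single channel: information about the customer is acquired from one channel, the data is analyzed, and the customer is assigned to the corresponding customer category. Because the data comes from a single channel, the accuracy of the customer category analysis suffers greatly if that channel holds little data or the data is falsified.
Technical Problem
The main purpose of this application is to provide a customer category analysis method and apparatus, a computer device, and a storage medium that improve the accuracy of customer category analysis results.
Technical Solution
To achieve this purpose, this application first proposes a customer category analysis method, including:
acquiring data related to a new customer on each of multiple channels;
clustering the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels;
forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer;
calculating the similarity between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories;
recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.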
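This application further provides a customer category analysis apparatus, including:
an acquisition unit, configured to acquire data related to a new customer on each of multiple channels;
a clustering unit, configured to cluster the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels;
a vectorization unit, configured to form the multiple groups of cluster data into a sparse matrix and complete the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer;
a calculation unit, configured to calculate the similarity between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories;
a selection unit, configured to record the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.
This application further provides a computer device including a memory and a processor, the memory storing computer-readable instructions, wherein the processor implements the steps of any of the methods described above when executing the computer-readable instructions.
This application further provides a non-volatile computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the steps of any of the methods described above.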
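Beneficial Effects
The customer category analysis method and apparatus, computer device, and storage medium of this application acquire and analyze data from multiple channels of a new customer, so that each person is evaluated more accurately. Multi-channel evaluation assesses the new customer more comprehensively and avoids the evaluation bias caused by falsified personal information in a single channel, so the customer category of the new customer is obtained accurately and products suitable for the new customer are recommended, improving the efficiency of converting recommendations into sales.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a customer category analysis method according to an embodiment of this application;
FIG. 2 is a schematic flowchart of a customer category analysis method according to an embodiment of this application;
FIG. 3 is a detailed schematic flowchart of step S2 of the customer category analysis method according to an embodiment of this application;
FIG. 4 is a schematic structural block diagram of a customer category analysis apparatus according to an embodiment of this application;
FIG. 5 is a schematic structural block diagram of a customer category analysis apparatus according to an embodiment of this application;
FIG. 6 is a schematic structural block diagram of a first recommendation unit according to an embodiment of this application;
FIG. 7 is a schematic structural block diagram of a clustering unit according to an embodiment of this application;
FIG. 8 is a schematic structural block diagram of a customer category analysis apparatus according to an embodiment of this application;
FIG. 9 is a schematic structural block diagram of a customer category analysis apparatus according to an embodiment of this application;
FIG. 10 is a schematic structural block diagram of a computer device according to an embodiment of this application.
Best Mode for Carrying Out the Invention
To make the purpose, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it.
Referring to FIG. 1, an embodiment of this application proposes a customer category analysis method, including the steps of:
S1: acquiring data related to a new customer on each of multiple channels;
S2: clustering the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels;
S3: forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer;
S4: calculating the similarity between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories;
S5: recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.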
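As described in step S1, "multiple channels" means two or more channels, and a channel is a source from which data is acquired, such as a game channel, an online interaction channel, a consumption channel, or a social communication channel. Channel data is mainly acquired by purchasing it or by crawling it with web crawler technology. In this embodiment four channels are selected: a game channel, an online interaction channel (WeChat), a consumption channel (Alipay), and a social communication channel (Weibo). Game, WeChat, and Alipay channel data can be purchased after authorization by the person concerned; Weibo data can be obtained by crawling and can also be purchased. The game channel generally uses WeChat game data; in other embodiments, data from other games such as NetEase Games or Shanda Games may be used. Game channel data mainly includes game spending data and play time data. WeChat channel data mainly includes Moments data (posts to Moments, the number of contacts in Moments, the number of contacts with long-term interaction, the content posted, content posted by others in Moments, and so on). Alipay channel data mainly includes consumption records, consumption venues, and consumption types. Weibo channel data mainly includes the content of posts, follow records, and content posted by followed accounts. In other embodiments, data from further channels may be acquired, such as transportation data (choice of vehicle, frequency of business trips) and dining channel data (dining spending, dining type, dining time).
As described in step S2, clustering is performed on each channel's data separately using the K-means algorithm: initialize the constant K and randomly select initial points as centroids; assign each data point to the nearest centroid; recompute the centroids; repeat the previous two steps until the centroids no longer change. Because K-means is an existing clustering algorithm, the detailed procedure is not repeated here. K-means is chosen because it is fast and simple, is efficient and scalable on large data sets, has near-linear time complexity, and is suited to mining large-scale data sets.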
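The K-means loop described for step S2 can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the applicant's implementation: the function name `kmeans`, the optional `init` argument (used here only to make the example reproducible), and the sample points are all hypothetical.

```python
import random

def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Cluster points (tuples of floats) into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Random initial centroids unless the caller supplies them (assumption for reproducibility).
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:  # stop once the centroids no longer move
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical per-channel data: (monthly spend, hours played), two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, labels = kmeans(points, k=2, init=[points[0], points[3]])
```

In this sketch each channel would be clustered by its own call to `kmeans`, giving one group of cluster data per channel.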
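As described in step S3, a sparse matrix is a matrix in which the number of non-zero elements is far smaller than the total number of elements and the non-zero elements are irregularly distributed. Clustering the groups of data reduces the data volume, and because the data types and sources differ, once each group's clustering result is substituted into a preset matrix the non-zero elements are irregularly distributed and weakly correlated, which yields a sparse matrix. Specifically, the data comprises four groups: one group's clustering result is used as the first row of the sparse matrix and the other three as the second, third, and fourth rows. The collaborative filtering method here fills the gaps between the non-zero elements of the sparse matrix with zeros: because the clustering results of the groups differ, zeros are added between non-zero elements so that the data aligns, giving the first vector matrix. The first vector matrix contains the data features of all the channels, so in later use the overall judgment is not thrown off merely because one channel's data has been tampered with.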
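A minimal sketch of the row stacking and zero filling described for step S3 is given below. It follows the text literally, where completion means padding with zeros; this differs from collaborative filtering in the usual recommender-system sense, which estimates missing entries from similar users. The function name and the sample vectors are assumptions, and for simplicity the sketch pads each row at its end.

```python
def build_first_vector_matrix(channel_vectors):
    """Stack one cluster-summary vector per channel as rows, padding short rows with zeros."""
    width = max(len(v) for v in channel_vectors)
    return [list(v) + [0.0] * (width - len(v)) for v in channel_vectors]

# Hypothetical cluster summaries for the four channels (game, WeChat, Alipay, Weibo).
first_matrix = build_first_vector_matrix([
    [0.7, 0.1, 0.2],
    [0.4],
    [0.3, 0.9],
    [0.5, 0.5, 0.1],
])
```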
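As described in step S4, the second vector matrices are vector matrices compiled in advance from historical customers. Because the categories of historical customers are already known, the data of each historical customer acquired through the same channels can be processed by steps S1 to S3 to generate a first vector matrix whose customer category is known. The first vector matrices of the historical customers in each customer category are collected, and the matrices belonging to the same category are averaged to obtain that category's second vector matrix. Historical customers are classified into customer categories by learning vector quantization (LVQ), as follows:
(1) Given a labeled sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_m,y_m)\}$, where $D$ is the collected sample set, $x_j$ is a sample point and $y_j$ is its class label;
(2) Initialize the vector labels $t_i$, where $t_i$ is the label of the $i$-th prototype vector;
(3) Each sample is described by $n$ features: $x_j=(x_{j1},x_{j2},\dots,x_{jn})$, $y_j\in Y$, $j=1,2,\dots,m$, where $x_{j1},\dots,x_{jn}$ are the features of sample point $x_j$;
(4) The learning goal of LVQ is to obtain $k$ vectors $q_1,q_2,\dots,q_k$, where each $q_i$ is a vector to be learned;
(5) Initialize the vectors: a sample satisfying $y_j=t_i$ is used as the initial value of $q_i$;
(6) Pick any sample $x_j$ from $D$ and find the nearest vector $q_i$; if $y_j$ equals $t_i$, then $q'=q_i+\eta(x_j-q_i)$, otherwise $q'=q_i-\eta(x_j-q_i)$, where $\eta$ is a parameter (the learning rate);
(7) Update the vector: $q_i=q'$;
(8) Decide whether to stop iterating based on a maximum number of iterations or a vector update threshold;
(9) Once the vectors are obtained, each vector corresponds to a region, and the sample points in a region belong to that vector's class, which yields the customer categories.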
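Steps (1) to (9) above can be sketched in a few lines of Python. This is an illustrative LVQ sketch under stated assumptions: the function names, the learning rate value, the fixed random seed, and the toy samples are hypothetical, and only the maximum-iteration stopping rule of step (8) is implemented.

```python
import random

def nearest(x, prototypes):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, prototypes[i])))

def train_lvq(samples, labels, prototypes, proto_labels, eta=0.1, max_iter=200, seed=0):
    """Return prototype vectors after max_iter LVQ updates (steps 6 to 8)."""
    rng = random.Random(seed)
    protos = [list(q) for q in prototypes]
    for _ in range(max_iter):
        j = rng.randrange(len(samples))          # step (6): pick any sample x_j
        x, y = samples[j], labels[j]
        i = nearest(x, protos)                   # nearest prototype q_i
        sign = 1.0 if proto_labels[i] == y else -1.0
        # q' = q_i + eta (x_j - q_i) if labels match, otherwise q_i - eta (x_j - q_i)
        protos[i] = [q + sign * eta * (a - q) for a, q in zip(x, protos[i])]
    return protos

def classify(x, prototypes, proto_labels):
    """Step (9): a point belongs to the class of its nearest prototype."""
    return proto_labels[nearest(x, prototypes)]

# Hypothetical historical customers with known categories "A" and "B".
samples = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels = ["A", "A", "A", "B", "B", "B"]
# Step (5): initialise each prototype from a sample carrying the same label.
protos = train_lvq(samples, labels, [samples[0], samples[3]], ["A", "B"])
category = classify((1.1, 0.9), protos, ["A", "B"])
```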
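In this application, because the first vector matrix is the vector matrix of a customer whose category is yet to be determined, its similarity to each preset second vector matrix is computed to find the most similar second vector matrix. The similarity may be computed with Euclidean distance, Manhattan distance, Minkowski distance, or cosine similarity.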
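The averaging that produces a category's second vector matrix, and the similarity lookup of steps S4 and S5, can be sketched as follows. Cosine similarity is used here as one of the measures the text lists; the function names, category names, and matrix values are assumptions made for illustration.

```python
import math

def mean_matrix(matrices):
    """Element-wise mean of equally shaped matrices (lists of rows)."""
    n = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) / n for c in range(cols)] for r in range(rows)]

def cosine_similarity(a, b):
    """Cosine similarity of two matrices, treated as flattened vectors."""
    fa = [x for row in a for x in row]
    fb = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(fa, fb))
    na = math.sqrt(sum(x * x for x in fa))
    nb = math.sqrt(sum(x * x for x in fb))
    return dot / (na * nb) if na and nb else 0.0

def classify_customer(first_matrix, category_db):
    """Return the category whose second vector matrix is most similar to first_matrix."""
    return max(category_db, key=lambda name: cosine_similarity(first_matrix, category_db[name]))

# Hypothetical category database built from historical customers.
category_db = {
    "conservative": mean_matrix([[[1, 0], [0, 1]], [[3, 0], [0, 1]]]),
    "aggressive": [[0, 5], [4, 0]],
}
category = classify_customer([[1.8, 0.1], [0.0, 1.2]], category_db)
```

If a distance such as Euclidean distance were used instead, the most similar category would be the one with the smallest distance, so `max` would become `min`.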
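As described in step S5, the customer category of the new customer is recorded so that it can be used later, for example to recommend products to the new customer.
Referring to FIG. 2, in one embodiment, after step S5 of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
S6: finding product information corresponding to the customer category of the new customer;
S7: recommending the product information to the new customer.
As described in steps S6 and S7, product information is recommended to the new customer according to the new customer's category, to improve the efficiency of converting recommendations into sales. In this application, the product information corresponding to a customer category is that of the products the category buys most. Since which products each category buys in larger quantities is recorded, the product information for a new customer is easy to obtain. Specifically, finding the product information corresponding to the new customer's category includes: (1) looking up, in a preset database, all product records purchased by customers of the new customer's category; (2) finding the qualifying product information in those records, that is, the products ranked ahead of a specified rank when sorted by purchase quantity from most to least; (3) recommending the qualifying product information to the new customer by email, WeChat, SMS, or similar means.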
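The lookup in (1) and (2) amounts to counting purchases within the category and keeping the top-ranked products. A minimal sketch follows; the record layout `(category, product)`, the function name, and the sample data are assumptions.

```python
from collections import Counter

def top_products(purchase_records, category, top_n):
    """Products bought most often by customers of `category`, best seller first."""
    counts = Counter(product for cat, product in purchase_records if cat == category)
    return [product for product, _ in counts.most_common(top_n)]

# Hypothetical purchase history: (customer category, product name).
records = [
    ("A", "fund"), ("A", "fund"), ("A", "fund"),
    ("A", "term life"), ("A", "term life"),
    ("A", "bond"), ("B", "bond"),
]
recommended = top_products(records, "A", top_n=2)
```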
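In one embodiment, step S7 of recommending the product information to the new customer includes:
S71: recommending the product information to the new customer in chart form, wherein the chart form includes a text introduction of the product described by the product information and a sales data chart of the product.
As described in step S71, the sales data chart may be any chart that can represent data, such as a histogram, a line chart, or an area chart. The product information recommended to a new customer may cover multiple products whose sales data differ. Once the sales data is visualized, the new customer can see at a glance which product sells best, which makes reviewing the recommended content more efficient.
Referring to FIG. 3, in one embodiment, step S2 of clustering the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels includes:
S21: performing feature extraction on the data obtained through each channel to obtain multiple feature data corresponding to each channel;
S22: extracting, from the multiple feature data corresponding to each channel, the feature data that is not correlated with the other feature data, as uncorrelated feature data;
S23: clearing the data corresponding to the uncorrelated feature data of each channel, and clustering the remaining data of each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels.
As described in steps S21, S22, and S23, feature extraction is performed on the data obtained through each channel to obtain one group of feature data per channel, each group containing multiple feature data. Correlation analysis is then performed on each group to find the feature data that is not correlated with the other feature data in the group, which is recorded as uncorrelated feature data. Because it is not correlated with the other feature data, the underlying data may be problematic, so that data is cleared in advance to improve the accuracy of the subsequent clustering. In this application, feature extraction may use the ReliefF algorithm, which Kononenko proposed in 1994 as an improvement on the Relief algorithm. Relief is a feature weighting algorithm that assigns each feature a weight according to its relevance to the class and removes features whose weight falls below a threshold. Unlike Relief, ReliefF can handle multi-class problems. Because ReliefF is a known algorithm, the feature extraction process is not described further here.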
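The correlation analysis of steps S22 and S23 can be illustrated with a Pearson correlation check. The text does not name a correlation measure, so Pearson correlation, the 0.3 threshold, the function names, and the feature values are all assumptions; ReliefF itself is not implemented in this sketch.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return cov / math.sqrt(vx * vy)

def drop_uncorrelated(features, threshold=0.3):
    """Split features (name -> values) into kept features and dropped feature names.

    A feature is dropped when its absolute correlation with every other
    feature stays below the threshold.
    """
    dropped = []
    for name, col in features.items():
        others = [abs(pearson(col, other)) for n2, other in features.items() if n2 != name]
        if others and max(others) < threshold:
            dropped.append(name)
    kept = {n: c for n, c in features.items() if n not in dropped}
    return kept, dropped

# Hypothetical game-channel features for five observation periods.
features = {
    "spend": [10, 20, 30, 40, 50],
    "play_hours": [1, 2, 3, 4, 5],
    "noise": [3, 1, 3, 1, 3],
}
kept, dropped = drop_uncorrelated(features)
```

The data behind the dropped features would then be cleared before the remaining data is clustered.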
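In one embodiment, after step S5 of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
S501: acquiring medical data of the new customer;
S502: selecting, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase;
S503: filtering out, from the insurance product information of the suitable insurance products, the insurance product information corresponding to the customer category of the new customer and recommending it to the new customer.
Steps S501, S502, and S503 are mainly used in insurance sales scenarios. The medical data mainly includes the new customer's social security card usage data at hospitals and electronic medical record data, from which the new customer's physical condition can be preliminarily determined. According to that condition, the insurance products suitable for purchase are first screened from the insurance product information database; the insurance product information corresponding to the customer category is then found among the suitable products and recommended to the new customer. Screening by physical condition means that, depending on the customer's physical condition, certain insurance products cannot be purchased. For example, if the medical data shows that the new customer has a certain disease and an insurance product happens to include cover for that disease, that product is not applicable to the new customer, whereas insurance products without cover for that disease may be suitable. In this application, the insurance product information recommended to the new customer may be that of the best-selling insurance product for the customer category, or that of the insurance products ranked ahead of a specified rank by sales volume for the category (the higher the sales volume, the higher the rank).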
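The two-stage selection of steps S502 and S503 can be sketched as a filter followed by a ranking. The data layout, the function name, and the product names are hypothetical; the exclusion rule mirrors the example in the text, where a product covering a disease the customer already has is treated as unsuitable.

```python
def recommend_insurance(products, diagnosed_conditions, category_sales, top_n=3):
    """Return up to top_n suitable products, best seller in the customer's category first.

    products: product name -> set of conditions the product covers
    diagnosed_conditions: set of conditions found in the customer's medical data
    category_sales: product name -> units sold to the customer's category
    """
    # Stage 1 (S502): drop products that cover a condition the customer already has.
    suitable = [name for name, covered in products.items()
                if not (covered & diagnosed_conditions)]
    # Stage 2 (S503): rank the remaining products by sales within the customer category.
    suitable.sort(key=lambda name: category_sales.get(name, 0), reverse=True)
    return suitable[:top_n]

products = {
    "critical-illness plan": {"diabetes", "cancer"},
    "accident plan": set(),
    "dental plan": {"caries"},
}
sales = {"critical-illness plan": 90, "accident plan": 40, "dental plan": 55}
picks = recommend_insurance(products, {"diabetes"}, sales, top_n=2)
```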
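In one embodiment, step S502 of selecting, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase includes:
S5021: performing feature extraction on the medical data to obtain multiple medical feature data;
S5022: extracting, from the multiple medical feature data, the feature data that is not correlated with the other medical feature data, as uncorrelated medical feature data;
S5023: clearing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the retained medical data, insurance product information of insurance products suitable for the new customer to purchase.
As described in steps S5021, S5022, and S5023, the medical data may contain data related to insurance fraud, and such data generally differs from ordinary data. For example, when someone deliberately uses a social security card to buy medicine and then resells the medicine to other stores, the card's swipe frequency and swipe amounts follow a pattern: the medicines bought differ each time but the amounts are similar, and purchases occur at regular intervals. These medical data have low correlation with the rest, so correlation analysis of the feature data can extract them as uncorrelated medical feature data. The corresponding medical data is then cleared, and the retained medical data is used to determine the insurance product information of insurance products the new customer can purchase.
In other embodiments, the facial features of the new customer may also be obtained and input into different preset disease prediction models to determine whether the new customer has the corresponding disease, after which suitable insurance product information is provided for the new customer to choose from. Each disease prediction model is trained on the facial features of a large number of different people together with the same disease associated with those facial features; given new facial features, it outputs whether the person with those features has the disease.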
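In another possible embodiment, after step S5 of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
S511: acquiring credit information of the new customer;
S512: selecting, according to the credit information, credit product information of credit products suitable for the new customer to apply for;
S513: filtering out, from the credit product information of the suitable credit products, the credit product information corresponding to the customer category and recommending it to the new customer.
Steps S511, S512, and S513 are mainly used in financial loan scenarios. The credit products include micro loans, mortgage loans, home purchase loans, and so on. The credit information reflects the new customer's creditworthiness in the banking system. For example, a new customer who has repeatedly failed to repay a credit card on time has low creditworthiness and may be unable to take out a large mortgage loan; a new customer who has used a credit card for a long time and repaid on time every month has high creditworthiness and can take out a large loan; a new customer who has never used a credit card has the initial creditworthiness value, and credit products with a moderate loan amount are considered. In this application, the credit products the new customer can apply for are determined first, and the credit products corresponding to the customer category are then selected from among them. This improves the recommendation effect and helps the new customer accurately choose a credit product they can apply for.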
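The tiering described above can be sketched as a small rule table. The text gives no numeric thresholds, so the cut-off of three missed payments, the loan limits, the tier names, and the function names below are illustrative assumptions only, as is the handling of customers with one or two missed payments.

```python
# Illustrative loan ceilings per creditworthiness tier (assumed values).
MAX_AMOUNT = {"low": 50_000, "initial": 200_000, "high": 1_000_000}

def credit_tier(months_of_history, missed_payments):
    """Map a credit-card history to a creditworthiness tier."""
    if months_of_history == 0:
        return "initial"   # never used a credit card: initial value, moderate amounts
    if missed_payments >= 3:
        return "low"       # "repeatedly" late; the threshold of 3 is an assumption
    if missed_payments == 0:
        return "high"      # long-term use, always repaid on time
    return "initial"       # case not covered by the text; treated as moderate here

def recommend_credit(products, tier, category_products):
    """Credit products within the tier's ceiling that also match the customer category.

    products: product name -> loan amount
    category_products: set of product names associated with the customer category
    """
    limit = MAX_AMOUNT[tier]
    return [name for name, amount in products.items()
            if amount <= limit and name in category_products]

products = {"micro loan": 20_000, "mortgage": 800_000, "car loan": 150_000}
tier = credit_tier(months_of_history=0, missed_payments=0)
offers = recommend_credit(products, tier, {"micro loan", "mortgage", "car loan"})
```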
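The customer category analysis method of the embodiments of this application acquires and analyzes data from multiple channels of a new customer, so that each person is evaluated more accurately. Multi-channel evaluation assesses the new customer more comprehensively and avoids the evaluation bias caused by falsified personal information in a single channel, so the customer category of the new customer is obtained accurately and products suitable for the new customer are recommended, improving the efficiency of converting recommendations into sales.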
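Referring to FIG. 4, an embodiment of this application proposes a customer category analysis apparatus, including:
an acquisition unit 10, configured to acquire data related to a new customer on each of multiple channels;
a clustering unit 20, configured to cluster the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels;
a vectorization unit 30, configured to form the multiple groups of cluster data into a sparse matrix and complete the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer;
a calculation unit 40, configured to calculate the similarity between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories;
a selection unit 50, configured to record the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.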
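In the acquisition unit 10, "multiple channels" means two or more channels, and a channel is a source from which data is acquired, such as a game channel, an online interaction channel, a consumption channel, or a social communication channel. Channel data is mainly acquired by purchasing it or by crawling it with web crawler technology. In this embodiment four channels are selected: a game channel, an online interaction channel (WeChat), a consumption channel (Alipay), and a social communication channel (Weibo). Game, WeChat, and Alipay channel data can be purchased after authorization by the person concerned; Weibo data can be obtained by crawling and can also be purchased. The game channel generally uses WeChat game data; in other embodiments, data from other games such as NetEase Games or Shanda Games may be used. Game channel data mainly includes game spending data and play time data. WeChat channel data mainly includes Moments data (posts to Moments, the number of contacts in Moments, the number of contacts with long-term interaction, the content posted, content posted by others in Moments, and so on). Alipay channel data mainly includes consumption records, consumption venues, and consumption types. Weibo channel data mainly includes the content of posts, follow records, and content posted by followed accounts. In other embodiments, data from further channels may be acquired, such as transportation data (choice of vehicle, frequency of business trips) and dining channel data (dining spending, dining type, dining time).
In the clustering unit 20, clustering is performed on each channel's data separately using the K-means algorithm: initialize the constant K and randomly select initial points as centroids; assign each data point to the nearest centroid; recompute the centroids; repeat the previous two steps until the centroids no longer change. Because K-means is an existing clustering algorithm, the detailed procedure is not repeated here. K-means is chosen because it is fast and simple, is efficient and scalable on large data sets, has near-linear time complexity, and is suited to mining large-scale data sets.
In the vectorization unit 30, a sparse matrix is a matrix in which the number of non-zero elements is far smaller than the total number of elements and the non-zero elements are irregularly distributed. Clustering the groups of data reduces the data volume, and because the data types and sources differ, once each group's clustering result is substituted into a preset matrix the non-zero elements are irregularly distributed and weakly correlated, which yields a sparse matrix. Specifically, the data comprises four groups: one group's clustering result is used as the first row of the sparse matrix and the other three as the second, third, and fourth rows. The collaborative filtering method here fills the gaps between the non-zero elements of the sparse matrix with zeros: because the clustering results of the groups differ, zeros are added between non-zero elements so that the data aligns, giving the first vector matrix. The first vector matrix contains the data features of all the channels, so in later use the overall judgment is not thrown off merely because one channel's data has been tampered with.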
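In the calculation unit 40, the second vector matrices are vector matrices compiled in advance from historical customers. Because the categories of historical customers are already known, the data of each historical customer acquired through the same channels can be processed by the acquisition unit 10, the clustering unit 20, and the vectorization unit 30 to generate a first vector matrix whose customer category is known. The first vector matrices of the historical customers in each customer category are collected, and the matrices belonging to the same category are averaged to obtain that category's second vector matrix. Historical customers are classified into customer categories by learning vector quantization (LVQ), as follows:
(1) Given a labeled sample set $D=\{(x_1,y_1),(x_2,y_2),\dots,(x_m,y_m)\}$, where $D$ is the collected sample set, $x_j$ is a sample point and $y_j$ is its class label;
(2) Initialize the vector labels $t_i$, where $t_i$ is the label of the $i$-th prototype vector;
(3) Each sample is described by $n$ features: $x_j=(x_{j1},x_{j2},\dots,x_{jn})$, $y_j\in Y$, $j=1,2,\dots,m$, where $x_{j1},\dots,x_{jn}$ are the features of sample point $x_j$;
(4) The learning goal of LVQ is to obtain $k$ vectors $q_1,q_2,\dots,q_k$, where each $q_i$ is a vector to be learned;
(5) Initialize the vectors: a sample satisfying $y_j=t_i$ is used as the initial value of $q_i$;
(6) Pick any sample $x_j$ from $D$ and find the nearest vector $q_i$; if $y_j$ equals $t_i$, then $q'=q_i+\eta(x_j-q_i)$, otherwise $q'=q_i-\eta(x_j-q_i)$, where $\eta$ is a parameter (the learning rate);
(7) Update the vector: $q_i=q'$;
(8) Decide whether to stop iterating based on a maximum number of iterations or a vector update threshold;
(9) Once the vectors are obtained, each vector corresponds to a region, and the sample points in a region belong to that vector's class, which yields the customer categories.
In this application, because the first vector matrix is the vector matrix of a customer whose category is yet to be determined, its similarity to each preset second vector matrix is computed to find the most similar second vector matrix. The similarity may be computed with Euclidean distance, Manhattan distance, Minkowski distance, or cosine similarity.
In the selection unit 50, the customer category of the new customer is recorded so that it can be used later, for example to recommend products to the new customer.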
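Referring to FIG. 5, in one embodiment, the customer category analysis apparatus further includes:
a search unit 60, configured to find product information corresponding to the customer category of the new customer;
a first recommendation unit 70, configured to recommend the product information to the new customer.
The search unit 60 and the first recommendation unit 70 recommend product information to the new customer according to the new customer's category, to improve the efficiency of converting recommendations into sales. In this application, the product information corresponding to a customer category is that of the products the category buys most. Since which products each category buys in larger quantities is recorded, the product information for a new customer is easy to obtain. Specifically, finding the product information corresponding to the new customer's category includes: (1) looking up, in a preset database, all product records purchased by customers of the new customer's category; (2) finding the qualifying product information in those records, that is, the products ranked ahead of a specified rank when sorted by purchase quantity from most to least; (3) recommending the qualifying product information to the new customer by email, WeChat, SMS, or similar means.
Referring to FIG. 6, in one embodiment, the first recommendation unit 70 includes:
a chart recommendation module 71, configured to recommend the product information to the new customer in chart form, wherein the chart form includes a text introduction of the product described by the product information and a sales data chart of the product.
In the chart recommendation module 71, the sales data chart may be any chart that can represent data, such as a histogram, a line chart, or an area chart. The product information recommended to a new customer may cover multiple products whose sales data differ. Once the sales data is visualized, the new customer can see at a glance which product sells best, which makes reviewing the recommended content more efficient.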
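Referring to FIG. 7, in one embodiment, the clustering unit 20 includes:
a first feature extraction module 21, configured to perform feature extraction on the data obtained through each channel to obtain multiple feature data corresponding to each channel;
a first correlation analysis module 22, configured to extract, from the multiple feature data corresponding to each channel, the feature data that is not correlated with the other feature data, as uncorrelated feature data;
a first clearing and clustering module 23, configured to clear the data corresponding to the uncorrelated feature data of each channel and to cluster the remaining data of each channel separately, obtaining multiple groups of cluster data in one-to-one correspondence with the multiple channels.
In the first feature extraction module 21, the first correlation analysis module 22, and the first clearing and clustering module 23, feature extraction is performed on the data obtained through each channel to obtain one group of feature data per channel, each group containing multiple feature data. Correlation analysis is then performed on each group to find the feature data that is not correlated with the other feature data in the group, which is recorded as uncorrelated feature data. Because it is not correlated with the other feature data, the underlying data may be problematic, so that data is cleared in advance to improve the accuracy of the subsequent clustering. In this application, feature extraction may use the ReliefF algorithm, which Kononenko proposed in 1994 as an improvement on the Relief algorithm. Relief is a feature weighting algorithm that assigns each feature a weight according to its relevance to the class and removes features whose weight falls below a threshold. Unlike Relief, ReliefF can handle multi-class problems. Because ReliefF is a known algorithm, the feature extraction process is not described further here.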
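Referring to FIG. 8, in one embodiment, the customer category analysis apparatus further includes:
a medical data acquisition unit 501, configured to acquire medical data of the new customer;
a first selecting unit 502, configured to select, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase;
a second recommendation unit 503, configured to filter out, from the insurance product information of the suitable insurance products, the insurance product information corresponding to the customer category of the new customer and recommend it to the new customer.
The actions performed by the medical data acquisition unit 501, the first selecting unit 502, and the second recommendation unit 503 are mainly used in insurance sales scenarios. The medical data mainly includes the new customer's social security card usage data at hospitals and electronic medical record data, from which the new customer's physical condition can be preliminarily determined. According to that condition, the insurance products suitable for purchase are first screened from the insurance product information database; the insurance product information corresponding to the customer category is then found among the suitable products and recommended to the new customer. Screening by physical condition means that, depending on the customer's physical condition, certain insurance products cannot be purchased. For example, if the medical data shows that the new customer has a certain disease and an insurance product happens to include cover for that disease, that product is not applicable to the new customer, whereas insurance products without cover for that disease may be suitable. In this application, the insurance product information recommended to the new customer may be that of the best-selling insurance product for the customer category, or that of the insurance products ranked ahead of a specified rank by sales volume for the category (the higher the sales volume, the higher the rank).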
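In one embodiment, the first selecting unit 502 includes:
a second feature extraction module, configured to perform feature extraction on the medical data to obtain multiple medical feature data;
a second correlation analysis module, configured to extract, from the multiple medical feature data, the feature data that is not correlated with the other medical feature data, as uncorrelated medical feature data;
a second clearing and clustering module, configured to clear the medical data corresponding to the uncorrelated medical feature data and to select, according to the retained medical data, insurance product information of insurance products suitable for the new customer to purchase.
In the second feature extraction module, the second correlation analysis module, and the second clearing and clustering module, the medical data may contain data related to insurance fraud, and such data generally differs from ordinary data. For example, when someone deliberately uses a social security card to buy medicine and then resells the medicine to other stores, the card's swipe frequency and swipe amounts follow a pattern: the medicines bought differ each time but the amounts are similar, and purchases occur at regular intervals. These medical data have low correlation with the rest, so correlation analysis of the feature data can extract them as uncorrelated medical feature data. The corresponding medical data is then cleared, and the retained medical data is used to determine the insurance product information of insurance products the new customer can purchase.
In other embodiments, a disease judging unit may also be provided. It obtains the facial features of the new customer and inputs them into different preset disease prediction models to determine whether the new customer has the corresponding disease, after which suitable insurance product information is provided for the new customer to choose from. Each disease prediction model is trained on the facial features of a large number of different people together with the same disease associated with those facial features; given new facial features, it outputs whether the person with those features has the disease.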
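Referring to FIG. 9, in another possible embodiment, the customer category analysis apparatus further includes:
a credit information acquisition unit 511, configured to acquire credit information of the new customer;
a second selecting unit 512, configured to select, according to the credit information, credit product information of credit products suitable for the new customer to apply for;
a third recommendation unit 513, configured to filter out, from the credit product information of the suitable credit products, the credit product information corresponding to the customer category and recommend it to the new customer.
The actions specified for the credit information acquisition unit 511, the second selecting unit 512, and the third recommendation unit 513 are mainly used in financial loan scenarios. The credit products include micro loans, mortgage loans, home purchase loans, and so on. The credit information reflects the new customer's creditworthiness in the banking system. For example, a new customer who has repeatedly failed to repay a credit card on time has low creditworthiness and may be unable to take out a large mortgage loan; a new customer who has used a credit card for a long time and repaid on time every month has high creditworthiness and can take out a large loan; a new customer who has never used a credit card has the initial creditworthiness value, and credit products with a moderate loan amount are considered. In this application, the credit products the new customer can apply for are determined first, and the credit products corresponding to the customer category are then selected from among them. This improves the recommendation effect and helps the new customer accurately choose a credit product they can apply for.
The customer category analysis apparatus of the embodiments of this application acquires and analyzes data from multiple channels of a new customer, so that each person is evaluated more accurately. Multi-channel evaluation assesses the new customer more comprehensively and avoids the evaluation bias caused by falsified personal information in a single channel, so the customer category of the new customer is obtained accurately and products suitable for the new customer are recommended, improving the efficiency of converting recommendations into sales.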
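Referring to FIG. 10, an embodiment of this application further provides a computer device. The computer device may be a server, and its internal structure may be as shown in FIG. 10. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores data such as the channel data acquired from each channel. The network interface of the computer device communicates with external terminals through a network connection. The computer-readable instructions, when executed by the processor, implement a customer category analysis method.
The steps of the customer category analysis method executed by the processor are: acquiring data related to a new customer on each of multiple channels; clustering the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels; forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer; calculating the similarity between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories; and recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.
In one embodiment, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes: finding product information corresponding to the customer category of the new customer; and recommending the product information to the new customer.
In one embodiment, the step of recommending the product information to the new customer includes: recommending the product information to the new customer in chart form, wherein the chart form includes a text introduction of the product described by the product information and a sales data chart of the product.
In one embodiment, the step of forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer, includes:
performing feature extraction on the data obtained through each channel to obtain multiple feature data corresponding to each channel; extracting, from the multiple feature data corresponding to each channel, the feature data that is not correlated with the other feature data, as uncorrelated feature data; clearing the data corresponding to the uncorrelated feature data of each channel, and clustering the remaining data of each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels.
In one embodiment, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes: acquiring medical data of the new customer; selecting, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase; and filtering out, from the insurance product information of the suitable insurance products, the insurance product information corresponding to the customer category of the new customer and recommending it to the new customer.
In one embodiment, the step of selecting, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase includes: performing feature extraction on the medical data to obtain multiple medical feature data; extracting, from the multiple medical feature data, the feature data that is not correlated with the other medical feature data, as uncorrelated medical feature data; clearing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the retained medical data, insurance product information of insurance products suitable for the new customer to purchase.
In one embodiment, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes: acquiring credit information of the new customer; selecting, according to the credit information, credit product information of credit products suitable for the new customer to apply for; and filtering out, from the credit product information of the suitable credit products, the credit product information corresponding to the customer category and recommending it to the new customer.
Those skilled in the art will understand that the structure shown in FIG. 10 is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied.
The computer device of the embodiments of this application acquires and analyzes data from multiple channels of a new customer, so that each person is evaluated more accurately. Multi-channel evaluation assesses the new customer more comprehensively and avoids the evaluation bias caused by falsified personal information in a single channel, so the customer category of the new customer is obtained accurately and products suitable for the new customer are recommended, improving the efficiency of converting recommendations into sales.
An embodiment of this application further provides a non-volatile computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the processes of the method embodiments described above.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structural or procedural transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.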

Claims (20)

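  1. A customer category analysis method, characterized by including:
    acquiring data related to a new customer on each of multiple channels;
    clustering the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels;
    forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer;
    calculating the similarity between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories;
    recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.
  2. The customer category analysis method according to claim 1, characterized in that, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
    finding product information corresponding to the customer category of the new customer;
    recommending the product information to the new customer.
  3. The customer category analysis method according to claim 2, characterized in that the step of recommending the product information to the new customer includes:
    recommending the product information to the new customer in chart form, wherein the chart form includes a text introduction of the product described by the product information and a sales data chart of the product.
  4. The customer category analysis method according to claim 1, characterized in that the step of forming the multiple groups of cluster data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer, includes:
    performing feature extraction on the data obtained through each channel to obtain multiple feature data corresponding to each channel;
    extracting, from the multiple feature data corresponding to each channel, the feature data that is not correlated with the other feature data, as uncorrelated feature data;
    clearing the data corresponding to the uncorrelated feature data of each channel, and clustering the remaining data of each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels.
  5. The customer category analysis method according to claim 1, characterized in that, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
    acquiring medical data of the new customer;
    selecting, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase;
    filtering out, from the insurance product information of the suitable insurance products, the insurance product information corresponding to the customer category of the new customer and recommending it to the new customer.
  6. The customer category analysis method according to claim 5, characterized in that the step of selecting, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase includes:
    performing feature extraction on the medical data to obtain multiple medical feature data;
    extracting, from the multiple medical feature data, the feature data that is not correlated with the other medical feature data, as uncorrelated medical feature data;
    clearing the medical data corresponding to the uncorrelated medical feature data, and selecting, according to the retained medical data, insurance product information of insurance products suitable for the new customer to purchase.
  7. The customer category analysis method according to claim 1, characterized in that, after the step of recording the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer, the method includes:
    acquiring credit information of the new customer;
    selecting, according to the credit information, credit product information of credit products suitable for the new customer to apply for;
    filtering out, from the credit product information of the suitable credit products, the credit product information corresponding to the customer category and recommending it to the new customer.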
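  8. A customer category analysis apparatus, characterized by including:
    an acquisition unit, configured to acquire data related to a new customer on each of multiple channels;
    a clustering unit, configured to cluster the data acquired from each channel separately to obtain multiple groups of cluster data in one-to-one correspondence with the multiple channels;
    a vectorization unit, configured to form the multiple groups of cluster data into a sparse matrix and complete the sparse matrix by a collaborative filtering method to form a first vector matrix corresponding to the new customer;
    a calculation unit, configured to calculate the similarity between the first vector matrix and each of multiple second vector matrices in a preset customer category database, wherein the customer category database includes multiple customer categories and second vector matrices in one-to-one correspondence with the customer categories;
    a selection unit, configured to record the customer category corresponding to the second vector matrix with the highest similarity to the first vector matrix as the customer category of the new customer.
  9. The customer category analysis apparatus according to claim 8, characterized in that the customer category analysis apparatus further includes:
    a search unit, configured to find product information corresponding to the customer category of the new customer;
    a first recommendation unit, configured to recommend the product information to the new customer.
  10. The customer category analysis apparatus according to claim 9, characterized in that the first recommendation unit includes:
    a chart recommendation module, configured to recommend the product information to the new customer in chart form, wherein the chart form includes a text introduction of the product described by the product information and a sales data chart of the product.
  11. The customer category analysis apparatus according to claim 8, characterized in that the clustering unit includes:
    a first feature extraction module, configured to perform feature extraction on the data obtained through each channel to obtain multiple feature data corresponding to each channel;
    a first correlation analysis module, configured to extract, from the multiple feature data corresponding to each channel, the feature data that is not correlated with the other feature data, as uncorrelated feature data;
    a first clearing and clustering module, configured to clear the data corresponding to the uncorrelated feature data of each channel and to cluster the remaining data of each channel separately, obtaining multiple groups of cluster data in one-to-one correspondence with the multiple channels.
  12. The customer category analysis apparatus according to claim 8, characterized in that the customer category analysis apparatus further includes:
    a medical data acquisition unit, configured to acquire medical data of the new customer;
    a first selecting unit, configured to select, according to the medical data, insurance product information of insurance products suitable for the new customer to purchase;
    a second recommendation unit, configured to filter out, from the insurance product information of the suitable insurance products, the insurance product information corresponding to the customer category of the new customer and recommend it to the new customer.
  13. The customer category analysis apparatus according to claim 12, characterized in that the first selecting unit includes:
    a second feature extraction module, configured to perform feature extraction on the medical data to obtain multiple medical feature data;
    a second correlation analysis module, configured to extract, from the multiple medical feature data, the feature data that is not correlated with the other medical feature data, as uncorrelated medical feature data;
    a second clearing and clustering module, configured to clear the medical data corresponding to the uncorrelated medical feature data and to select, according to the retained medical data, insurance product information of insurance products suitable for the new customer to purchase.
  14. The customer category analysis apparatus according to claim 8, characterized in that the customer category analysis apparatus further includes:
    a credit information acquisition unit, configured to acquire credit information of the new customer;
    a second selecting unit, configured to select, according to the credit information, credit product information of credit products suitable for the new customer to apply for;
    a third recommendation unit, configured to filter out, from the credit product information of the suitable credit products, the credit product information corresponding to the customer category and recommend it to the new customer.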
  15. 一种计算机设备,包括存储器和处理器,所述存储器存储有计算机可读指令,其特征在于,所述处理器执行所述计算机可读指令时实现客户类别分析方法,该客户类别分析方法包括:
    分别获取多条通道上与新客户相关的数据;
    将每条通道获取的数据分别进行聚类处理,得到与所述多条通道一一对应的多组聚类数据;
    将多组聚类数据形成稀疏矩阵,并通过协同过滤方法将所述稀疏矩阵补齐,形成对应所述新客户的第一向量矩阵;
    将所述第一向量矩阵分别与预设的客户类别数据库中的多个第二向量矩阵进行相似度计算;其中,客户类别数据库中包括多个客户类别,以及与客户类别一一对应的第二向量矩阵;
    将与所述第一向量矩阵相似度最高的第二向量矩阵对应的客户类别记为所述新客户的客户类别。
  16. 根据权利要求15所述的计算机设备,其特征在于,所述将与所述第一向量矩阵相似度最高的第二向量矩阵对应的客户类别记为所述新客户的客户类别的步骤之后,包括:
    查找与所述新客户的客户类别对应的产品信息;
    将所述产品信息推荐给所述新客户。
  17. 根据权利要求16所述的计算机设备,其特征在于,所述将所述产品信息推荐给所述新客户的步骤,包括:
    将所述产品信息形成图表形式推荐给所述新客户,其中,所述图表形式包括产品信息的产品的文字介绍,以及产品的销售数据图。
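  18. The computer device according to claim 15, wherein the step of forming the plurality of groups of clustered data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method, to form a first vector matrix corresponding to the new customer, comprises:
    performing feature extraction separately on the data acquired over each channel, obtaining a plurality of feature data corresponding to each channel;
    extracting, from the plurality of feature data corresponding to each channel, the feature data that is uncorrelated with the other feature data, as uncorrelated feature data; and
    removing, for each channel, the data corresponding to the uncorrelated feature data, and clustering the data remaining for each channel separately, obtaining a plurality of groups of clustered data in one-to-one correspondence with the plurality of channels.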
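  19. A non-volatile computer-readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions, when executed by a processor, implement a customer category analysis method, the customer category analysis method comprising:
    separately acquiring data related to a new customer over a plurality of channels;
    clustering the data acquired over each channel separately, obtaining a plurality of groups of clustered data in one-to-one correspondence with the plurality of channels;
    forming the plurality of groups of clustered data into a sparse matrix, and completing the sparse matrix by a collaborative filtering method, to form a first vector matrix corresponding to the new customer;
    calculating the similarity between the first vector matrix and each of a plurality of second vector matrices in a preset customer category database, wherein the customer category database includes a plurality of customer categories and second vector matrices in one-to-one correspondence with the customer categories; and
    recording, as the customer category of the new customer, the customer category corresponding to the second vector matrix having the highest similarity to the first vector matrix.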
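  20. The non-volatile computer-readable storage medium according to claim 19, wherein, after the step of recording, as the customer category of the new customer, the customer category corresponding to the second vector matrix having the highest similarity to the first vector matrix, the method comprises:
    looking up product information corresponding to the customer category of the new customer; and
    recommending the product information to the new customer.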
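
The core steps recited in the independent claims (complete a sparse representation by collaborative filtering, compare it with one reference per customer category, keep the most similar category) can be illustrated with a minimal sketch. It is not the applicant's implementation. It assumes, for illustration only, that each "vector matrix" is flattened to a plain list of numbers, that missing entries are marked with `None`, that completion uses a similarity-weighted average over previously known customers, and that similarity is cosine similarity. The function names, the `known_rows` history, and the category labels are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity over the positions where both vectors have a value."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = math.sqrt(sum(x * x for x, _ in pairs))
    norm_b = math.sqrt(sum(y * y for _, y in pairs))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fill_missing(new_row, known_rows):
    """Assumed collaborative-filtering step: estimate each missing entry as a
    similarity-weighted average of the same entry in fully known rows."""
    weights = [max(cosine(new_row, row), 0.0) for row in known_rows]
    total = sum(weights)
    filled = []
    for j, value in enumerate(new_row):
        if value is not None:
            filled.append(value)
        elif total:
            filled.append(sum(w * row[j] for w, row in zip(weights, known_rows)) / total)
        else:
            filled.append(0.0)  # no similar customer at all: fall back to zero
    return filled


def classify(new_row, category_vectors, known_rows):
    """Return the most similar category and the completed ("first") vector."""
    first_vector = fill_missing(new_row, known_rows)
    best = max(category_vectors,
               key=lambda name: cosine(first_vector, category_vectors[name]))
    return best, first_vector


# Hypothetical data: four clustered features, two of them unobserved.
known_rows = [[1.0, 0.0, 1.0, 0.0], [0.9, 0.1, 0.8, 0.2], [0.0, 1.0, 0.0, 1.0]]
category_vectors = {
    "conservative": [1.0, 0.0, 1.0, 0.0],  # stands in for a second vector matrix
    "aggressive": [0.0, 1.0, 0.0, 1.0],
}
category, first_vector = classify([1.0, None, None, 0.0], category_vectors, known_rows)
print(category, first_vector)
```

Other completion schemes (for example, matrix factorization) and other similarity measures would fit the claim language equally well; the weighted average and cosine similarity are chosen here only because they are short to write down.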
PCT/CN2018/095482 2018-05-25 2018-07-12 Customer category analysis method, apparatus, computer device, and storage medium WO2019223082A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810545793.3 2018-05-25
CN201810545793.3A CN108876444A (zh) 2018-05-25 2018-05-25 Customer category analysis method, apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2019223082A1 (zh) 2019-11-28

Family

ID=64335870

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/095482 WO2019223082A1 (zh) Customer category analysis method, apparatus, computer device, and storage medium 2018-05-25 2018-07-12

Country Status (2)

Country Link
CN (1) CN108876444A (zh)
WO (1) WO2019223082A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636482B (zh) * 2018-12-21 2021-07-27 南京星云数字技术有限公司 Data processing method and system based on a similarity model
CN116797253B (zh) * 2022-12-13 2024-03-01 乖乖数字科技(苏州)有限公司 Classification management method based on customer resources

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103198418A (zh) * 2013-03-15 2013-07-10 北京亿赞普网络技术有限公司 Application recommendation method and system
US20160188734A1 (en) * 2014-12-30 2016-06-30 Socialtopias, Llc Method and apparatus for programmatically synthesizing multiple sources of data for providing a recommendation
CN105868334A (zh) * 2016-03-28 2016-08-17 云南财经大学 Feature-increment-based personalized movie recommendation method and system
CN106951489A (zh) * 2017-03-13 2017-07-14 杭州师范大学 Personalized recommendation method and apparatus for sparse big data
CN107256495A (zh) * 2017-05-27 2017-10-17 上海非码网络科技有限公司 Method, system, and server for dividing customer groups by tag based on multi-platform data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103839041B (zh) * 2012-11-27 2017-07-18 腾讯科技(深圳)有限公司 Method and apparatus for identifying client features
US10275628B2 (en) * 2016-05-27 2019-04-30 Adobe Inc. Feature summarization filter with applications using data analytics

Also Published As

Publication number Publication date
CN108876444A (zh) 2018-11-23

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18919617; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 18919617; Country of ref document: EP; Kind code of ref document: A1)