CN112990386B - User value clustering method and device, computer equipment and storage medium

Info

Publication number: CN112990386B
Application number: CN202110531499.9A
Authority: CN
Inventors: 刘志伟
Original assignee: Taiping Financial Technology Services Shanghai Co Ltd Shenzhen Branch
Current assignee: Taiping Financial Technology Services Shanghai Co Ltd Shenzhen Branch
Priority date: 2021-05-17
Filing date: 2021-05-17
Publication date: 2021-08-03
Anticipated expiration: 2041-05-17
Also published as: CN112990386A

Abstract

The application relates to a user value clustering method, a user value clustering device, computer equipment and a storage medium. The method comprises the following steps: acquiring a user data set; the user data set comprises user data corresponding to more than one user; determining a current transaction index, a predicted transaction index and a potential transaction index according to the data of each user; determining the importance of the current transaction index, the predicted transaction index and the potential transaction index; determining a clustering index according to the importance; and clustering the users according to the clustering indexes to obtain various user value classification sets. By adopting the method, the accuracy of user value classification can be improved.

Description

User value clustering method and device, computer equipment and storage medium

Technical Field

The present application relates to the field of big data technologies, and in particular, to a user value clustering method, apparatus, computer device, and storage medium.

Background

With the development of computer technology, traditional offline services are gradually shifted to online for processing, so that the amount of online data becomes more and more huge. For companies, it is becoming more and more important how to analyze and process huge online data to obtain valid data. For example, a company can divide users into different levels by analyzing online data, and then can execute business activities of corresponding levels on the users of different levels, thereby improving business efficiency.

In the traditional method, a single user is classified according to a fixed judgment rule, without considering the relationship between that user's data and the data of other users, so the classification is inaccurate.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a user value clustering method, apparatus, computer device and storage medium capable of improving accuracy of user value classification.

A user value clustering method comprises the following steps:

acquiring a user data set; the user data set comprises user data corresponding to more than one user;

determining a current transaction index, a predicted transaction index and a potential transaction index according to the data of each user;

determining the importance of the current transaction index, the predicted transaction index and the potential transaction index;

determining a clustering index according to the importance;

and clustering the users according to the clustering indexes to obtain various user value classification sets.

A user value clustering apparatus, the apparatus comprising:

the acquisition module is used for acquiring a user data set; the user data set comprises user data corresponding to more than one user;

the index determining module is used for determining a current transaction index, a predicted transaction index and a potential transaction index according to the data of each user;

the relevancy determination module is used for determining the importance of the current transaction index, the predicted transaction index and the potential transaction index;

the cluster determination module is used for determining a clustering index according to the importance;

and the clustering module is used for clustering the users according to the clustering indexes to obtain various user value classification sets.

The user value clustering method, the user value clustering device, the computer equipment and the storage medium acquire a user data set; the user data set includes user data corresponding to more than one user. The current transaction index, the predicted transaction index and the potential transaction index are then determined according to the user data, so that multidimensional transaction indexes are determined from the user data of a plurality of users and are therefore determined more accurately. Next, the importance of the current transaction index, the predicted transaction index and the potential transaction index is determined, and the clustering index is determined according to the importance. Finally, the users are clustered according to the clustering indexes to obtain a plurality of user value category sets. In this way, the user data is analyzed, multidimensional transaction indexes are determined according to the characteristics of the users, and the users are clustered according to those indexes. Because various data are considered together during clustering, the clustering result better matches the characteristics of all users in the service scenario, and the user clustering is more accurate.

A customer service request processing method comprises the following steps: receiving a customer service request, wherein the customer service request carries customer service data; calculating to obtain a customer service value category set by a user value clustering method according to the customer service data; acquiring a service strategy corresponding to the customer service value category set; and processing the customer service request according to the service policy.

A customer service request processing apparatus, the apparatus comprising:

the receiving module is used for receiving a customer service request, and the customer service request carries customer service data;

the calculation module is used for calculating to obtain a customer service value category set through a user value clustering method according to the customer service data;

the strategy acquisition module is used for acquiring a service strategy corresponding to the customer service value category set;

and the processing module is used for processing the customer service request according to the service strategy.

A computer device comprising a memory storing a computer program and a processor implementing the steps of the method when the processor executes the computer program.

A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method.

According to the customer service request processing method, the value category of the current customer service is determined against the customer service value category set obtained from the data of a plurality of users, and the corresponding service strategy is assigned to the customer service of that value category, so that the service adapts automatically to customer services of different levels.

Drawings

FIG. 1 is a diagram of an application environment of a user value clustering method in one embodiment;

FIG. 2 is a flow chart illustrating a user value clustering method according to an embodiment;

FIG. 3 is a graph illustrating cluster distortion value variation curves corresponding to different cluster numbers in an embodiment;

FIG. 4 is a schematic diagram illustrating a distribution of cluster centers at different levels in an embodiment;

FIG. 5 is a block diagram illustrating an exemplary embodiment of a user value clustering device;

FIG. 6 is a block diagram illustrating an exemplary embodiment of a customer service request processing apparatus;

FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The user value clustering method provided by the application can be applied to the application environment shown in FIG. 1, in which the terminal 110 communicates with the server 120 through a network. The server 120 obtains a user data set; the user data set comprises user data corresponding to more than one user. The server 120 determines a current transaction index, a predicted transaction index and a potential transaction index according to each user's data; determines the importance of the current transaction index, the predicted transaction index and the potential transaction index; determines a clustering index according to the importance; and clusters the users according to the clustering indexes to obtain a plurality of user value category sets. The server 120 then pushes the user value category sets to the terminal 110, so that the terminal 110 can push services of the corresponding category to each user according to the user's category. The terminal 110 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer or portable wearable device, and the server 120 may be implemented by an independent server or by a server cluster formed by a plurality of servers.

In one embodiment, as shown in fig. 2, a user value clustering method is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:

step S202, acquiring a user data set; the user data set includes user data corresponding to more than one user.

The user data set comprises user data corresponding to a plurality of users. The user data may be data covering a historical period of time, and may include attribute information of the user, historical transaction data of the user, historical behavior information of the user, and the like. The user attribute information may be, for example, the name, gender and geographic location of the user. The historical transaction data may be the running record of transactions generated by the user, such as the products purchased, the purchase frequency and the purchase price. The behavior information may be the user's behavior in a transaction scenario or in other non-transaction scenarios, such as whether a transaction of the user succeeded.

Specifically, the server crawls user data of a plurality of users from the business system and generates a user data set. Further, the server can also perform cleaning processing on the crawled user data, such as removing error data in the user data or data which does not conform to a standard format, so that the user data conforms to subsequent data processing requirements, and the accuracy of data processing is improved. And the server can also perform normalization processing on the crawled user data so as to enable data calculation among data of different dimensions.

Step S204, determining the current transaction index, the predicted transaction index and the potential transaction index according to the user data.

The current transaction index is an index corresponding to the user in the current time period, and there may be one or more current transaction indexes. For example, the current transaction indexes may include the user's current purchase interval, current purchase frequency, current continuous purchase age, current premium and current profit.

The predicted transaction index is an index corresponding to a future time period, and there may be one or more predicted transaction indexes. The predicted transaction indexes correspond to the current transaction indexes, and the server can predict a future transaction index from the corresponding current transaction index. For example, the predicted transaction indexes may be the user's development purchase interval, development purchase frequency, development continuous purchase age, development premium and development profit for a product.

Specifically, the predicted transaction index may be obtained in the server by using a pre-trained index prediction model, or may be obtained according to a pre-constructed index prediction formula. The obtaining manner of the predicted transaction index is not limited, as long as the predicted transaction index can be obtained according to the historical transaction index.

The potential transaction index is an index obtained after the server mines the potential value of the user, and there may be one or more potential transaction indexes. A potential transaction index may be data that is not directly reflected in the user's transaction data but can affect the user's transactions. For example, the potential transaction index may be the user's asset information: in general, the more assets a user has, the higher the probability that the user makes a transaction. The asset information may include house property value, vehicle value, monthly income, premium income proportion, and the like.

Specifically, the server may retrieve relevant data related to the user according to the user attribute information in the user data, and determine the potential transaction index of the user according to the relevant data. In one embodiment, the relevant data may be transaction data or behavior data of the user in other business processes, and future behavior of the user may be predicted based on the relevant data to determine the potential value of the user.

In this step, the transaction indexes are determined from the user data of a plurality of users in the user data set rather than from the data of a single user. The resulting indexes are therefore more comprehensive, represent the characteristics of most users, and are more generally applicable, which improves the accuracy of the subsequent value clustering of users.

Step S206, determining the importance of the current transaction index, the predicted transaction index and the potential transaction index.

The importance indicates the degree of influence of each transaction index on the value result, specifically the importance of the current transaction index, the predicted transaction index and the potential transaction index to the value result. The greater the importance (degree of influence) of a transaction index on the value result, the greater the role that index plays in the user value clustering process.

In the present application, the transaction indexes of the three dimensions and their sub-transaction indexes are: the current purchase interval, current purchase frequency, current continuous purchase age, current premium and current profit in the current transaction index; the development purchase interval, development purchase frequency, development continuous purchase age, development premium and development profit in the predicted transaction index; and the house property value, vehicle value, monthly income and premium income proportion in the potential transaction index. Furthermore, the server simplifies index variables with obvious collinearity among these 14 sub-transaction indexes, which increases the training speed of the subsequent model, improves its accuracy and reduces the possibility of overfitting.
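The collinearity simplification mentioned above could be sketched as follows. The source does not say how collinearity is detected, so the pairwise Pearson correlation test, the 0.95 threshold, the function names and the sample columns are all assumptions made for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column has no linear relationship to report
    return cov / sqrt(vx * vy)

def drop_collinear(columns, threshold=0.95):
    """Keep an index only if |r| against every already-kept index is below threshold.

    The threshold is an assumed value, not one given in the source.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical sub-transaction index columns, one value per user.
columns = {
    "CCV_M": [1000.0, 2000.0, 3000.0, 4000.0],
    "CDV_M": [1100.0, 2100.0, 3100.0, 4100.0],  # CCV_M shifted by 100: collinear
    "CCV_F": [3.0, 1.0, 4.0, 1.0],
}
kept = drop_collinear(columns)
```

With these sample columns, CDV_M is dropped because it moves in lockstep with CCV_M, while CCV_F is kept.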

Specifically, the server extracts each sub-transaction index from the current transaction index, the predicted transaction index or the potential transaction index, then calculates the importance of each sub-transaction index, and determines the clustering index according to the value corresponding to the importance. In one embodiment, the importance of each sub-transaction index may be predetermined. Specifically, the server acquires user data corresponding to a plurality of users in advance, takes a data set formed by the user data as a training data set, determines sub-transaction indexes according to the training data set, acquires real value grades corresponding to the users, and calculates importance degrees of the sub-transaction indexes relative to the real value grades respectively. Further, the sub-transaction indexes and the importance degrees are correspondingly stored in a database. In the specific process of determining the user value category sets, the importance of each sub-transaction index can be directly obtained from the database, the clustering indexes are further determined according to the importance, and the users are divided into various user value category sets according to the clustering indexes.

Specifically, based on the prior experience of the business department, the value of each user is manually scored according to that user's data to obtain the user value level y_i. For n given user sample points, let the user sample corresponding to the first user be (x_1, …, x_m, y_1), where x_1, …, x_m are all the sub-transaction indexes and y_1 is the user value level of that user. The goal is to minimize the difference between the calculated user value level and the manually scored user value level y_i, which essentially means solving for the coefficient β of each sub-transaction index that makes this difference minimal. Specifically, for each sub-transaction index, the calculation process of the correlation coefficient is shown in formula (1).

（1）

Further, in order to improve the accuracy of the calculation of the correlation coefficient, a penalty coefficient λ may also be introduced in formula (1), and specifically, for each sub-transaction index, the calculation process of the correlation coefficient is as shown in formula (2).

（2）

The λ is a penalty coefficient set manually, and when the λ is larger, the penalty degree on the high-dimensional features is larger.
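The bodies of formulas (1) and (2) are not reproduced in this text. One reading consistent with the surrounding description (minimizing the difference between calculated and manually scored value levels, then adding a penalty weighted by λ) is an ordinary least-squares objective and its L1-penalized (lasso) variant. The L1 form is an assumption, suggested only by the later selection of sub-transaction indexes whose coefficient is not 0; the indices i and j are likewise introduced here for illustration.

```latex
% Assumed form of formula (1): least squares over n users and m sub-transaction indexes
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \Bigl( y_i - \sum_{j=1}^{m} \beta_j x_{ij} \Bigr)^{2}

% Assumed form of formula (2): the same objective with an L1 penalty weighted by \lambda
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \Bigl( y_i - \sum_{j=1}^{m} \beta_j x_{ij} \Bigr)^{2}
              + \lambda \sum_{j=1}^{m} \lvert \beta_j \rvert
```

Under this reading, a larger λ drives more coefficients exactly to 0, which matches the statement that a larger λ penalizes high-dimensional features more strongly.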

Specifically, the sub-transaction indexes of each user are substituted into formula (1) or (2) to obtain the parameter coefficient of each sub-transaction index relative to the value result; the numerical value of a parameter coefficient represents the importance of that sub-transaction index to the value result. A parameter coefficient table is then generated from the sub-transaction indexes and their parameter coefficients (importance) and stored in the server. When the user value is calculated, the pre-stored parameter coefficient table is obtained from the server, and the sub-transaction indexes whose parameter coefficient is not 0 are selected as the clustering indexes input to the clustering model (a k-means model); alternatively, the sub-transaction indexes whose parameter coefficient is larger than a preset value may be selected as the clustering indexes. The application does not limit this.
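As a minimal sketch of how a coefficient table and the "coefficient not 0" selection might be produced, the following fits an L1-penalized least-squares model by coordinate descent. It assumes the penalized formula is a lasso objective of the form sum of squared errors plus λ times the sum of absolute coefficients, which the source does not confirm. The function names, the index names attached to the columns and the toy data are hypothetical.

```python
def soft_threshold(rho, t):
    """Shrink rho toward zero by t; values within [-t, t] become exactly 0."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0

def lasso_coordinate_descent(rows, targets, lam, sweeps=100):
    """Minimize sum_i (y_i - x_i . beta)^2 + lam * sum_j |beta_j| by cyclic updates."""
    m = len(rows[0])
    beta = [0.0] * m
    for _ in range(sweeps):
        for j in range(m):
            rho = 0.0  # correlation of column j with the residual that excludes j
            z = 0.0    # squared norm of column j
            for x, y in zip(rows, targets):
                partial = sum(x[k] * beta[k] for k in range(m) if k != j)
                rho += x[j] * (y - partial)
                z += x[j] ** 2
            beta[j] = soft_threshold(rho, lam / 2.0) / z if z > 0 else 0.0
    return beta

# Toy, already-standardized data: three sub-transaction indexes for four users.
index_names = ["CCV_M", "CCV_F", "CPV_H"]
rows = [(1.0, 1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0), (-1.0, -1.0, 1.0)]
scores = [1.1, 2.9, -0.9, -3.1]  # stand-in for manually scored value levels

beta = lasso_coordinate_descent(rows, scores, lam=2.0)
coefficient_table = dict(zip(index_names, beta))

# Clustering indexes: sub-transaction indexes whose coefficient is not 0.
cluster_indexes = [name for name, b in coefficient_table.items() if b != 0.0]
```

In this toy data the second column barely influences the score, so its coefficient is shrunk to exactly 0 and only the other two indexes are selected. The alternative rule from the text (coefficient larger than a preset value) would replace the `b != 0.0` test with a threshold comparison.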

In the step, the clustering indexes are extracted from the current transaction indexes, the predicted transaction indexes and the potential transaction indexes according to the numerical values of the importance degrees, so that the index correlation among the clustering indexes is small, more information can be represented for users, and the redundancy among the indexes is reduced. And then clustering the user value by using the clustering index, thereby improving the efficiency and accuracy of user value clustering.

And step S208, determining a clustering index according to the importance.

The clustering index is a corresponding index when value clustering is performed on the user. It is to be understood that the cluster indicator is a transaction indicator of at least one of the current transaction indicator, the predicted transaction indicator, and the potential transaction indicator extracted according to importance. And the index correlation degree between different clustering indexes is small, and different clustering indexes can express user information in different aspects.

And step S210, clustering the users according to the clustering indexes to obtain various user value classification sets.

The set of user value categories includes a plurality of levels of sets of users. Specifically, each user is classified into different levels in the server according to the user data of each user, and the users of different levels are classified into different value categories to form a user value category set corresponding to a plurality of user levels.

Specifically, the server extracts a clustering index corresponding to each user from the user data according to the user data corresponding to each user, and clusters the users according to the clustering index of each user to obtain the user value level of each user. And dividing users belonging to the same user value level into the same user value category, and further forming a user value category set according to a plurality of user value categories.

In the step, in the process of clustering the single user, the single user is clustered by using the clustering index determined according to the user data set, and the correlation and the distinctiveness among different user data are considered in the process of clustering the user, so that the accuracy of clustering the user is higher.
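The clustering step can be illustrated with a small k-means sketch in plain Python. The text names k-means as the clustering model but gives no implementation details, so the initialization from the first k points, the iteration cap and the sample points are assumptions. The returned distortion (sum of squared distances to the assigned center) is the kind of quantity plotted against the number of clusters in FIG. 3.

```python
def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
            for p in points]

def kmeans(points, k, iterations=50):
    # Deterministic initialization from the first k points (an assumption;
    # random or k-means++ seeding is more common in practice).
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    distortion = sum(squared_distance(p, centers[l]) for p, l in zip(points, labels))
    return labels, centers, distortion

# Hypothetical standardized clustering indexes for six users (two indexes each).
points = [(-1.0, -1.1), (-0.9, -1.0), (1.0, 0.9), (1.1, 1.0), (-1.2, -1.2), (0.9, 1.1)]

labels, centers, distortion = kmeans(points, 2)
_, _, distortion_k1 = kmeans(points, 1)
```

Users sharing a label form one user value category. Running `kmeans` for several values of k and comparing the distortions gives the curve used to choose the number of clusters.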

In the above embodiment, a user data set is obtained; the user data set includes user data corresponding to more than one user. The current transaction index, the predicted transaction index and the potential transaction index are then determined according to the user data of each user, so that multidimensional transaction indexes are determined from the user data of a plurality of users and are therefore determined more accurately. Next, the importance of the current transaction index, the predicted transaction index and the potential transaction index is determined, and the clustering index is determined according to the importance. Finally, the users are clustered according to the clustering indexes to obtain a plurality of user value category sets. In this way, the user data is analyzed, multidimensional transaction indexes are determined according to the characteristics of the users, and the users are clustered according to those indexes. Because various data are considered together during clustering, the clustering result better matches the characteristics of all users in the service scenario, and the user clustering is more accurate.

In one embodiment, determining a current transaction index, a predicted transaction index, and a potential transaction index from the respective user data comprises: the existing value, the development value and the potential value of the user are determined according to the data of each user. Extracting current trading indexes from the existing value, extracting predicted trading indexes from the development value, and extracting potential trading indexes from the potential value.

The existing value is the user's current value to the company (Customer Current Value, CCV), that is, the value corresponding to the user in the current time period. Specifically, the CCV is the value generated by business actions the user has already taken, such as the policies the user has purchased and the claim services that have been handled. The user's existing value includes the current transaction indexes, which may be used to measure it. For example, among the current transaction indexes, CCV_R represents the existing purchase interval, CCV_F the existing purchase frequency, CCV_C the existing continuous purchase age, CCV_M the existing premium, and CCV_P the existing profit.

In one embodiment, the server sorts all the policies within the user's window period by insurance start date and retrieves the last policy purchased by the user. The expiration date of that policy is subtracted from the current time to obtain the user's current purchase interval (CCV_R). The server retrieves all policies within the user's window period for which the user is the applicant and counts them; the count is used as the existing purchase frequency (CCV_F). The server takes out all policies applied for within the user's window period and sorts them by start period. Whether insurance is continued is judged from the difference between the ending period E2 of the next policy and the starting period S1 of the previous policy: if E2 - S1 is less than or equal to 15, the insurance is judged to be continued; otherwise it is judged not to be continued. Among all the continuously insured policies of the user, the start period S1 of the first policy is subtracted from the end period En of the last policy to obtain the user's continuous purchase age (CCV_C). The server takes out all policies within the user's window period for which the user is the policyholder and sums their signed premiums to obtain the user's payment amount, i.e. the current premium (CCV_M). The server takes out all policies within the user's window period and sums them to calculate the total signed premium W, the market cost M and the total claim amount S. The market cost M and the total claim amount S are subtracted from the total signed premium W to obtain the net profit P of the customer, i.e. the current profit (CCV_P).
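The calculations for CCV_R, CCV_F, CCV_M and CCV_P described above can be sketched as below. The `Policy` record, its field names and the sample policies are hypothetical, and the purchase interval is expressed in days as an assumption. The continuous purchase age CCV_C is left out because it depends on how the continuity test is defined for a given data set.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Policy:
    start: date         # insurance start date
    end: date           # expiration date
    premium: float      # signed premium
    market_cost: float  # market cost attributed to the policy
    claims: float       # total claim amount paid on the policy

def current_indexes(policies, today):
    """Compute the current transaction indexes from the policies in the window period."""
    ordered = sorted(policies, key=lambda p: p.start)
    last = ordered[-1]                                   # last policy purchased
    total_premium = sum(p.premium for p in ordered)      # W
    total_cost = sum(p.market_cost for p in ordered)     # M
    total_claims = sum(p.claims for p in ordered)        # S
    return {
        "CCV_R": (today - last.end).days,  # current time minus expiration date
        "CCV_F": len(ordered),             # number of policies
        "CCV_M": total_premium,            # payment amount
        "CCV_P": total_premium - total_cost - total_claims,  # P = W - M - S
    }

# Hypothetical policies for one user.
policies = [
    Policy(date(2020, 1, 5), date(2021, 1, 4), 6000.0, 450.0, 1200.0),
    Policy(date(2019, 1, 1), date(2019, 12, 31), 5000.0, 400.0, 0.0),
]
indexes = current_indexes(policies, today=date(2021, 5, 17))
```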

The development value represents the user's development value to the company (Customer Development Value, CDV), that is, the value corresponding to the user in a future time period. In one example, the development value CDV reflects the user's ability to create value in the future, and a time series model can be built in the server to predict the user's transaction level in the coming years from the user's current transaction level. Specifically, the user's development value includes the predicted transaction indexes, which are used to measure it. For example, among the predicted transaction indexes, CDV_R represents the development purchase interval, CDV_F the development purchase frequency, CDV_C the development continuous purchase age, CDV_M the development premium, and CDV_P the development profit.
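The source does not specify the time series model. As a stand-in only, the sketch below fits a straight line to a user's yearly values by least squares and extrapolates one period ahead. The function name and the sample premium history are hypothetical.

```python
def linear_forecast(history):
    """Extrapolate the next value from a least-squares line through the history.

    A deliberately simple stand-in for the unspecified time series model.
    """
    n = len(history)
    if n < 2:
        return history[-1]  # not enough points to estimate a trend
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return y_mean + slope * (n - t_mean)

# Hypothetical yearly premiums for one user; the forecast plays the role of CDV_M.
yearly_premiums = [4000.0, 5000.0, 6000.0]
cdv_m = linear_forecast(yearly_premiums)
```

The same helper could be applied to each current transaction index in turn to obtain its development counterpart, although a production model would likely account for seasonality and limited history.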

The potential value is the potential value of the user (Customer Potential Value, CPV), that is, the value the user may have in a future time period. In one embodiment, a user has large purchasing potential that is not reflected in the transaction information of the target business system; in that case the potential value of the user can be calculated from the user's purchasing power indexes. The potential value of the user includes the potential transaction indexes, which can be used to measure it. Specifically, among the potential transaction indexes, CPV_H represents house property value, CPV represents vehicle value, CPV_I represents monthly income, and CPV_P represents premium income proportion.

Further, different transaction indexes differ in dimension (scale) because their data types differ. For example, the payment amount M of a user may be in the order of tens of thousands of dollars, while the continuous insurance period C is within five years. If the obtained transaction indexes were clustered directly, indexes with larger scales would carry larger weight and affect the accuracy of the final result. Therefore, in one embodiment, after the server obtains the multidimensional transaction indexes, it standardizes them to remove the dimensional differences between transaction indexes with different characteristics. Specifically, for any transaction index (current, predicted or potential), the mean μ and the standard deviation σ of that index are calculated; the calculation formula of the standard deviation σ is shown in formula (3).

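$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^{2}} \tag{3}$$

where x_i is the value of the transaction index for the i-th user and N is the number of users.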

For each transaction index x, standardization is performed by formula (4).

$$x' = \frac{x - \mu}{\sigma} \tag{4}$$

After standardization by formula (4), each transaction index has a mean of 0 and a standard deviation of 1, so the dimensional differences between transaction indexes are eliminated. This removes their influence on the feature weights of the clustering model, improves the accuracy of the model and speeds up clustering convergence.
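A minimal sketch of this standardization, assuming the population form of the standard deviation (division by the number of values) and using hypothetical sample values:

```python
from math import sqrt

def standardize(values):
    """Return z-scores (x - mu) / sigma for one transaction index across users."""
    n = len(values)
    mu = sum(values) / n
    # Population standard deviation; dividing by n is an assumption.
    sigma = sqrt(sum((v - mu) ** 2 for v in values) / n)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant index carries no scale
    return [(v - mu) / sigma for v in values]

# Two indexes on very different scales: premium amount and continuous period in years.
premiums = [10000.0, 20000.0, 30000.0]
years = [1.0, 3.0, 5.0]

z_premiums = standardize(premiums)
z_years = standardize(years)
```

After standardization the two indexes occupy the same range, so neither dominates the distance computation in the clustering model merely because of its unit.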

In the embodiment, in the process of clustering the users, the values of the users are described by the indexes of three dimensions (the current transaction index, the predicted transaction index and the potential transaction index), and the values of the dimensions are represented by a plurality of indexes in the three dimensions, so that the user values are more accurately and comprehensively measured, and the classification accuracy of the user values is improved.

In one embodiment, determining the user's present value, development value, and potential value from the respective user data includes: in the calculation thread, determining the existing value of the user according to the transaction information in the user data; determining the potential value of the user according to the asset information in the user data; constructing a prediction model according to the user data to determine the development value of the user according to the prediction model; clustering the users according to the clustering indexes to obtain various user value classification sets, comprising the following steps: in the value determining thread, clustering each user according to the existing value, the development value and the potential value of each user to obtain a plurality of user value category sets; wherein the computation thread works in parallel with the value determination thread.
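The cooperation of the two threads can be pictured as a producer and a consumer joined by a queue. This is a sketch only: the source does not describe how the threads exchange data, and the functions, record fields and the placeholder rules for each value (sum of premiums, monthly income, last premium repeated) are hypothetical.

```python
import queue
import threading

def forecast(premiums):
    """Placeholder prediction model: repeat the most recent premium."""
    return premiums[-1]

def computation_thread(users, out_queue):
    """Compute (existing, development, potential) per user and pass each result on."""
    for user in users:
        existing = sum(user["premiums"])          # from transaction information
        development = forecast(user["premiums"])  # from a prediction model
        potential = user["monthly_income"]        # from asset information
        out_queue.put((user["id"], (existing, development, potential)))
    out_queue.put(None)  # sentinel: no more users

def value_determination_thread(in_queue, vectors):
    """Collect value vectors as they arrive, ready for the clustering step."""
    while True:
        item = in_queue.get()
        if item is None:
            break
        user_id, vector = item
        vectors[user_id] = vector

users = [
    {"id": "u1", "premiums": [1000.0, 1200.0], "monthly_income": 8000.0},
    {"id": "u2", "premiums": [300.0], "monthly_income": 5000.0},
]
handoff = queue.Queue()
vectors = {}
producer = threading.Thread(target=computation_thread, args=(users, handoff))
consumer = threading.Thread(target=value_determination_thread, args=(handoff, vectors))
producer.start()
consumer.start()
producer.join()
consumer.join()
```

Because the consumer starts receiving vectors while the producer is still computing later users, the two stages overlap instead of running strictly one after the other.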

The transaction information is the running record of the user's transactions. For example, the transaction information may include transaction time, transaction amount and transaction type, and the transaction frequency can be determined from the transaction times. Because the transaction information corresponds to transaction behavior that has already occurred, the existing value of the user can be determined from it.

Specifically, the existing value of the user is determined in the computing thread based on the purchase frequency, the continuous purchase years, the existing premium, and the existing profit in the transaction information. In one embodiment, an existing value formula model can be constructed from the historical transaction information of a plurality of users at a plurality of transaction times within a historical time period, and the transaction information at the current time is then substituted into the formula model to obtain the existing value of the user. In another embodiment, a machine learning model may be trained on the historical transaction information of a plurality of users within a historical time period to obtain an existing value model, and the existing value of the user at the current time is calculated from the current transaction information and the existing value model. In other embodiments, a current index weight may be assigned to each current transaction index in the transaction information, and the existing value of the user is then calculated in the computing thread from the current index weights and the current transaction indexes.
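The weight-based variant described above can be sketched as a weighted sum. This is only an illustrative sketch: the index names, the weights, and the assumption that the indexes have already been standardized are hypothetical and are not taken from the source.

```python
from typing import Mapping


def weighted_value(indexes: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Return the weighted sum of the transaction indexes named in `weights`."""
    missing = set(weights) - set(indexes)
    if missing:
        raise KeyError(f"missing indexes: {sorted(missing)}")
    return sum(weights[name] * indexes[name] for name in weights)


# Hypothetical standardized current transaction indexes of one user.
indexes = {
    "purchase_frequency": 0.5,
    "continuous_purchase_years": 1.0,
    "premium": -0.2,
    "profit": 0.3,
}
# Hypothetical current index weights (assumed here to sum to 1).
weights = {
    "purchase_frequency": 0.3,
    "continuous_purchase_years": 0.2,
    "premium": 0.3,
    "profit": 0.2,
}

existing_value = weighted_value(indexes, weights)
```

The same function could be reused for the potential value and the development value by passing the corresponding indexes and weights.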

Specifically, the potential value of the user is determined in the computing thread according to the asset information in the user data. In one embodiment, a machine learning model may be trained on the asset information and potential values of a plurality of users to obtain a potential value determination model; in a specific implementation, the potential value of the current user is determined from the current user's asset information and the potential value determination model. In another embodiment, a potential value formula model can be constructed from the asset information of a plurality of users, and the asset information of the current user is then substituted into the formula model to obtain the potential value of the current user. In yet another embodiment, a potential index weight may be assigned to each potential transaction index in the asset information, and the potential value of the user is then calculated in the computing thread in the server from the potential index weights and the potential transaction indexes.

A prediction model is constructed from the user data in the computing thread in the server, and the development value of the user is determined according to the prediction model. In one embodiment, a machine learning model may be trained on the current transaction information and development values of a plurality of users to obtain a prediction value determination model; in a specific implementation, the development value of the current user is determined from the current user's transaction information and the prediction value determination model. In another embodiment, a prediction value formula model can be constructed from the transaction information of the user, and the current transaction information of the user is then substituted into the formula model to obtain the development value of the user. In yet another embodiment, a development index weight may be assigned to each current transaction index in the transaction information, and the development value of the user is then calculated in the computing thread from the development index weights and the current transaction indexes.

In the value determining thread in the server, the users are clustered according to the existing value, development value, and potential value of each user to obtain a plurality of user value category sets, wherein the computing thread works in parallel with the value determining thread.

In the above embodiment, the user data are processed in parallel by a plurality of threads in the server to obtain the user value category sets. The multi-dimensional value information of a user is determined in the computing thread, and the value category of the user is determined from that information in the value determining thread. Multi-thread parallel processing improves the efficiency of clustering the users. Because data at different stages are processed in parallel in different threads that do not interfere with each other (for example, while some users are being clustered in the value determining thread, the value information of other users can be calculated in the computing thread), different users are handled simultaneously in different threads, which greatly improves the efficiency of user value clustering.
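The two-thread arrangement can be illustrated with a small producer/consumer pipeline. This is a minimal sketch under stated assumptions: the queue hand-off, the function names, the input record layout, and the two fixed cluster centers are all hypothetical, and nearest-center assignment stands in for the clustering step.

```python
import queue
import threading

# Hypothetical cluster centers in (existing, development, potential) space.
CENTERS = {"low value": (0.0, 0.0, 0.0), "high value": (1.0, 1.0, 1.0)}


def compute_values(user_data, out_queue):
    """Computing thread: derive the three value scores for each user."""
    for user_id, data in user_data.items():
        # Placeholder: the scores are read directly from the input record.
        values = (data["existing"], data["development"], data["potential"])
        out_queue.put((user_id, values))
    out_queue.put(None)  # sentinel: no more users


def determine_categories(in_queue, results):
    """Value determining thread: assign each user to the nearest center."""
    while True:
        item = in_queue.get()
        if item is None:
            break
        user_id, values = item
        results[user_id] = min(
            CENTERS,
            key=lambda name: sum((v - c) ** 2 for v, c in zip(values, CENTERS[name])),
        )


def run_pipeline(user_data):
    """Run both threads in parallel and return the category of each user."""
    handoff = queue.Queue()
    results = {}
    producer = threading.Thread(target=compute_values, args=(user_data, handoff))
    consumer = threading.Thread(target=determine_categories, args=(handoff, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


users = {
    "u1": {"existing": 0.9, "development": 0.8, "potential": 0.7},
    "u2": {"existing": 0.1, "development": 0.2, "potential": 0.0},
}
categories = run_pipeline(users)
```

While the consumer is categorizing one user, the producer can already be computing the values of the next one, which is the overlap the embodiment describes.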

In one embodiment, constructing a prediction model according to the user data so as to determine the development value of the user according to the prediction model includes: extracting a plurality of current transaction indexes from the transaction information; acquiring a plurality of historical time series factors corresponding to each current transaction index, each historical time series factor corresponding to a different transaction time; acquiring the factor weight corresponding to each historical time series factor; constructing, according to the historical time series factors and the factor weights, an index prediction model corresponding to each current transaction index, and determining the predicted transaction index corresponding to each current transaction index according to the index prediction model; and determining the development value of the user according to the predicted transaction indexes corresponding to the current transaction indexes.

A historical time series factor is the value of a current transaction index at one of a plurality of historical transaction times, and each current transaction index corresponds to a plurality of historical time series factors. For example, the historical time series factors of the existing purchase interval are the purchase intervals in past years: 2 months in the most recent year, 4 months in the year before that, 3 months three years back, 4 months four years back, and so on. The historical time series factors of the existing purchase interval are then 2, 4, 3, and 4, respectively. A purchase interval prediction model can then be determined from the purchase intervals over these years, so that the value at the current time is predicted from the historical values.

Further, the server also obtains the factor weight corresponding to each historical time series factor and uses it to adjust that factor, so that the model fits the real business scenario. Specifically, the factor weights may be set according to the transaction time of each historical time series factor: for example, the factor whose transaction time is closest to the current time receives the largest factor weight, and the factor whose transaction time is farthest from the current time receives the smallest.
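As an illustration of recency-based factor weights, the sketch below forecasts the purchase interval from the example factors 2, 4, 3, and 4. The linearly decaying weighting scheme and the function names are assumptions made for illustration; the source only requires that more recent factors receive larger weights.

```python
def recency_weights(n):
    """Return n weights that sum to 1, largest for the most recent factor."""
    raw = [n - i for i in range(n)]  # n, n-1, ..., 1
    total = sum(raw)
    return [w / total for w in raw]


def weighted_forecast(factors_recent_first):
    """Weighted average of historical factors, ordered most recent first."""
    if not factors_recent_first:
        raise ValueError("at least one historical factor is required")
    weights = recency_weights(len(factors_recent_first))
    return sum(w * x for w, x in zip(weights, factors_recent_first))


# Purchase intervals in months, most recent year first (from the example above).
intervals = [2, 4, 3, 4]
predicted_interval = weighted_forecast(intervals)  # weights 0.4, 0.3, 0.2, 0.1
```

With these assumed weights the forecast is 0.4·2 + 0.3·4 + 0.2·3 + 0.1·4 = 3 months.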

Similarly, the prediction models for the existing purchase frequency, the existing continuous purchase years, the existing premium, and the existing profit are built in the same way as the purchase interval prediction model, which is not repeated here.

In the above embodiment, the index prediction model of each current transaction index is determined from the historical time series factors corresponding to that index and the factor weight of each factor. The predicted transaction index of each current transaction index is then obtained from its index prediction model, and the development value of the user is determined from the predicted transaction indexes. Because each predicted transaction index is obtained from its own model, and the historical time series factors are adjusted by factor weights adapted to the actual situation of each index, the predicted transaction indexes are obtained more accurately.

In one embodiment, the method further includes: for each current transaction index, acquiring a plurality of fluctuation values corresponding to the current transaction index, each fluctuation value corresponding to a different transaction time; and acquiring the fluctuation weight corresponding to each fluctuation value. Constructing, according to the historical sequence factors and the factor weights, the index prediction model corresponding to each current transaction index so as to determine the predicted transaction index according to the index prediction model then includes: constructing the index prediction model corresponding to each current transaction index according to the historical sequence factors, the factor weights, the fluctuation values, and the fluctuation weights, so as to determine the predicted transaction index corresponding to each current transaction index according to the index prediction model.

A fluctuation value is a value that influences the current transaction index in the business scenario. In a specific business, the predicted transaction index X_t is correlated with the historical sequence factors of the previous p periods, which are used to simulate the undisturbed situation in an ideal state. The predicted transaction index is also related to the fluctuation values of the previous q periods, which are used to simulate the real business situation. In a specific implementation, the historical sequence factors of the previous p periods and the fluctuation values of the previous q periods can therefore be combined to predict the transaction index. Specifically, the index prediction model corresponding to each current transaction index is constructed from the historical sequence factors, factor weights, fluctuation values, and fluctuation weights, so that the development value of the user under the user's behavior pattern can be described accurately. The index prediction model corresponding to each current transaction index is shown in formula (5).

（5）

Here, X_t is the predicted transaction index to be obtained; X_{t-p} denotes the historical sequence factor of the user's value p periods earlier; φ_p denotes the factor weight of each historical sequence factor; θ_q denotes the fluctuation value of the customer value q periods earlier; and ε denotes the fluctuation weight.
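The symbols defined above match an autoregressive moving-average structure, in which the prediction is a weighted sum of the previous p historical sequence factors plus a weighted sum of the previous q fluctuation values. The sketch below computes one prediction step under that assumption. It is not the patent's formula (5) itself, whose exact form is not reproduced in this text; the function name and the sample numbers are hypothetical.

```python
def predict_index(factors, factor_weights, fluctuations, fluctuation_weights):
    """One-step prediction from weighted history and weighted fluctuations.

    `factors` and `fluctuations` are ordered oldest first, so the last
    element is the most recent period. `factor_weights[i]` applies to the
    factor i + 1 periods back; `fluctuation_weights[j]` applies to the
    fluctuation value j + 1 periods back.
    """
    p, q = len(factor_weights), len(fluctuation_weights)
    if len(factors) < p or len(fluctuations) < q:
        raise ValueError("not enough history for the requested orders")
    history_part = sum(factor_weights[i] * factors[-1 - i] for i in range(p))
    fluctuation_part = sum(
        fluctuation_weights[j] * fluctuations[-1 - j] for j in range(q)
    )
    return history_part + fluctuation_part


# Hypothetical data: purchase intervals (oldest first) and past fluctuations.
factors = [4, 3, 4, 2]
fluctuations = [0.5, -0.5]
predicted = predict_index(
    factors, factor_weights=[0.5, 0.3], fluctuations=fluctuations,
    fluctuation_weights=[0.4],
)  # p = 2, q = 1
```

Here the history part is 0.5·2 + 0.3·4 = 2.2 and the fluctuation part is 0.4·(−0.5) = −0.2.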

Specifically, the values of p and q may be restricted to a preset range, for example values less than or equal to 3. The candidate values of p and q are then combined in every possible way, the Bayesian information criterion (BIC) value of each combination is calculated to form a BIC matrix, and the values of p and q of the combination with the minimum BIC value are taken as the final values, which completes the order determination of p and q.
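The order selection can be sketched as a grid search that keeps the (p, q) pair with the smallest BIC. The sketch assumes Gaussian errors, for which the BIC equals n·ln(RSS/n) + k·ln(n) up to an additive constant, and counts k = p + q + 1 parameters. The residual sums of squares are hypothetical inputs that would come from fitting each candidate model, which is not shown.

```python
import math
from itertools import product


def bic(rss, n, p, q):
    """Gaussian BIC up to an additive constant for a model of order (p, q)."""
    return n * math.log(rss / n) + (p + q + 1) * math.log(n)


def select_order(rss_by_order, n, max_order=3):
    """Return the (p, q) pair with the minimum BIC among the fitted orders."""
    best = None
    for p, q in product(range(max_order + 1), repeat=2):
        if (p, q) not in rss_by_order:
            continue  # this combination was not fitted
        score = bic(rss_by_order[(p, q)], n, p, q)
        if best is None or score < best[0]:
            best = (score, p, q)
    if best is None:
        raise ValueError("no candidate order within the preset range")
    return best[1], best[2]


# Hypothetical residual sums of squares from models fitted on n = 50 points.
rss_by_order = {(1, 0): 40.0, (1, 1): 30.0, (2, 1): 29.5, (3, 3): 29.0}
p_best, q_best = select_order(rss_by_order, n=50)
```

The larger models fit slightly better but are penalized for their extra parameters, so the search settles on the smaller order.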

In this embodiment, the index prediction model corresponding to each current transaction index is established with the determined values of p and q, and the development value of the user is predicted from the user's existing value over past years. Because the model is constructed from the historical sequence factors, factor weights, fluctuation values, and fluctuation weights, it can accurately describe the development value of the user under the user's behavior pattern.

In one embodiment, clustering the users according to the clustering indexes to obtain a plurality of user value category sets includes: presetting a cluster quantity interval that contains more than one cluster quantity; reading the cluster quantities from the interval one by one in order of size and clustering the users based on each cluster quantity to obtain the cluster distortion value corresponding to that cluster quantity; when the difference between the cluster distortion value of the latest cluster quantity and that of the previous cluster quantity is smaller than a preset value, taking the latest cluster quantity as the optimal cluster quantity; and clustering the users according to the optimal cluster quantity to obtain a plurality of user value category sets.

In one embodiment, obtaining a cluster distortion value corresponding to each cluster quantity includes: for each clustering quantity, acquiring a plurality of user value classification sets obtained by clustering users according to the clustering quantity; for each cluster quantity, acquiring a value central point and a non-central point of each user value category set; and for each cluster quantity, determining the distortion value of each user value category set according to the distance value between each value central point and the non-central point, and obtaining the cluster distortion value of the cluster quantity according to the distortion value of each user value category set.

Wherein the cluster number interval is a set including a plurality of cluster numbers. The clustering distortion value is a numerical value for measuring the accuracy of the clustering result.
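The stopping rule for the optimal cluster quantity can be sketched as follows. The function name, the callable `distortion_of`, and the sample distortion values are hypothetical; in practice `distortion_of(k)` would cluster the users into k groups and return the resulting cluster distortion value.

```python
def choose_cluster_count(candidate_ks, distortion_of, threshold):
    """Return the first k whose distortion differs from the previous k's
    distortion by less than `threshold`, scanning k in ascending order.
    Falls back to the largest candidate if no such k exists.
    """
    ks = sorted(candidate_ks)
    if not ks:
        raise ValueError("the cluster quantity interval is empty")
    previous = distortion_of(ks[0])
    for k in ks[1:]:
        current = distortion_of(k)  # evaluated lazily, one k at a time
        if abs(previous - current) < threshold:
            return k
        previous = current
    return ks[-1]


# Hypothetical distortion values for k = 2..6.
distortions = {2: 900.0, 3: 500.0, 4: 300.0, 5: 280.0, 6: 275.0}
best_k = choose_cluster_count(distortions.keys(), distortions.__getitem__, threshold=50.0)
```

Because the candidates are read in order and the scan stops at the first small difference, larger cluster quantities are never clustered.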

In one embodiment, user values may be clustered in the server with the k-means algorithm. As an unsupervised learning method, k-means requires the number of clusters to be determined manually before training starts. For an unfamiliar user group, the finer the division into user categories, the higher the classification accuracy, but the less effective the classification becomes, because the differences between users are no longer reflected. In a specific implementation, a cluster distortion value (SSE) can therefore be preset in the server as a cost function to measure the degree of distortion for each cluster number. For each cluster number, the distortion equals the sum of the squared distances between each cluster center and the members of that cluster, as shown in formula (6).

$$\mathrm{SSE} = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2 \qquad (6)$$

where k is the cluster number, C_j is the set of members of the j-th cluster, μ_j denotes the center of the j-th cluster, and x_i is the i-th predicted transaction index.

In one embodiment, the number of clusters is determined as follows. An approximate range of the cluster number k is first delimited from common sense to obtain the cluster number interval. A clustering model is then generated for each value of k in the interval, and the accuracy of each model is measured by the cluster distortion value (SSE), that is, the sum of squared clustering errors. As k increases, the number of clusters generated increases, and the SSE gradually decreases and eventually stabilizes. The point at which the curve changes from falling to flat is the cluster number best suited to the data.

FIG. 3 is a graph of the cluster distortion values corresponding to different cluster numbers in one embodiment. The k-SSE line graph in FIG. 3 shows that the curve rises slightly after a cluster number of 10 and then levels off. The cluster number most suitable for the user data is therefore determined to be 10, and k = 10 is selected when constructing the k-means model.
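The cluster distortion value described for formula (6) can be computed as in the following sketch. It assumes that each cluster center is the mean of the cluster's members, as in k-means, and that points are given as tuples of equal length; the function name and sample points are hypothetical.

```python
def cluster_sse(clusters):
    """Sum of squared distances from each member to its cluster's mean.

    `clusters` is a list of clusters, each a non-empty list of points.
    """
    total = 0.0
    for members in clusters:
        dim = len(members[0])
        # Center of the cluster: coordinate-wise mean of its members.
        center = [sum(point[d] for point in members) / len(members) for d in range(dim)]
        total += sum(
            sum((point[d] - center[d]) ** 2 for d in range(dim))
            for point in members
        )
    return total


# Two hypothetical clusters in a two-dimensional index space.
clusters = [
    [(0, 0), (2, 0)],          # center (1, 0), contributes 1 + 1 = 2
    [(5, 5), (5, 7), (5, 9)],  # center (5, 7), contributes 4 + 0 + 4 = 8
]
distortion = cluster_sse(clusters)
```

Evaluating this value for each candidate k gives the k-SSE curve from which the cluster number is read.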

In other embodiments, the cluster centers may first be initialized in the server before the clustering algorithm is executed. In particular, a better model can be generated when the initial center points are as far apart from each other as possible. The specific steps are as follows. Step one: initialize an empty set M to store the k selected cluster center points. Step two: randomly select the first center point μ from the input samples and add it to M. Step three: for every sample point x outside M, calculate the minimum distance d(x, M) from x to the points in M. Step four: randomly select the next center point μ using a probability distribution weighted by these distances, and add it to M. Steps three and four are repeated until k center points have been selected. Step five: execute the k-means algorithm starting from the selected center points.
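Steps one to four resemble the k-means++ seeding procedure, which can be sketched as follows. Weighting the selection by the squared minimum distance is the usual k-means++ choice and is an assumption here, since the source only mentions a weighted probability distribution; the function names and the seed parameter are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(samples, k, seed=0):
    """Pick k initial centers that tend to be far apart from each other."""
    rng = random.Random(seed)
    centers = [rng.choice(samples)]  # step two: first center chosen at random
    while len(centers) < k:
        # Step three: squared distance from each sample to its nearest center.
        d2 = [min(squared_distance(x, c) for c in centers) for x in samples]
        if sum(d2) == 0:
            raise ValueError("fewer than k distinct samples")
        # Step four: sample the next center with probability proportional to d2.
        # Already chosen points have weight 0 and cannot be picked again.
        centers.append(rng.choices(samples, weights=d2, k=1)[0])
    return centers


samples = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0), (9.0, 0.2)]
centers = init_centers(samples, k=3)
```

The returned centers would then be passed to the k-means iterations of step five.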

In one embodiment, the method further comprises: obtaining value center characteristic values corresponding to the user value category sets; converting each value center characteristic value into real service data; checking each user value category set according to the real service data; and removing the user value category sets which fail to pass the verification, and obtaining various user value category sets according to the remaining data after removal.

Specifically, after the clustering model has been trained, the user data are clustered with the model to obtain the user value category sets. The value center characteristic value corresponding to each user value category set is then obtained in the server, and the characteristics of all users in a cluster are approximately estimated from the value center characteristic value of its cluster center. The value center characteristic values are then restored to true business values by inverting the z-score standardization, and abnormal data that clearly violate business logic are removed.
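Restoring a standardized cluster center to a business value amounts to inverting the z-score transform. The sketch below assumes the standardization uses the population standard deviation; the source does not state which variant formula (4) uses, and the function names and premium figures are hypothetical.

```python
import statistics


def fit_zscore(values):
    """Return the mean and population standard deviation of the values."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant index")
    return mean, std


def standardize(x, mean, std):
    """Map a business value to its z-score."""
    return (x - mean) / std


def restore(z, mean, std):
    """Map a z-score (for example a cluster center) back to a business value."""
    return z * std + mean


# Hypothetical premiums of three users.
premiums = [10000.0, 20000.0, 30000.0]
mean, std = fit_zscore(premiums)
z_scores = [standardize(x, mean, std) for x in premiums]
restored = [restore(z, mean, std) for z in z_scores]
```

The mean and standard deviation fitted before clustering must be kept, because the same pair is needed to restore the cluster centers afterwards.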

In one embodiment, the five-dimensional RFCMP transaction indexes among the user's current transaction indexes (CCV_R is the existing purchase interval, CCV_F the existing purchase frequency, CCV_C the existing continuous purchase years, CCV_M the existing premium, and CCV_P the existing profit) are input into a pre-trained clustering model, which clusters the users into different user value grades. For example, when the users are grouped into 7 categories, the data at the cluster center of each category represent the data features of that category. When the cluster-center data of individual categories are found to be abnormal, the characteristic values of those cluster centers can be restored by the inverse algorithm to obtain true business values, which are then checked to extract the abnormal data.

Further, abnormal clusters found from the cluster centers may include the following. Sometimes an agent applies for insurance on behalf of a large number of users, so that the agent's information is clustered instead of the information of the actual users concerned; the data then need to be reprocessed so that the clustered information belongs to the users rather than the agent. Alternatively, business personnel may make an entry error when entering user information, for example erroneously entering a premium as 2 million, so that the user's data are clustered into a class of their own; such a cluster is also abnormal and needs to be re-checked. Table 1 shows the cluster centers generated by the clustering algorithm.

Table 1: Cluster centers generated by the clustering algorithm

As can be seen from Table 1, the values "-2914731.12", "26247.68", "5332.36", "750.40", and "60654089.36" belong to abnormal cluster centers: the premium of the class 2 cluster center is about 60 million yuan, the number of insurance applications of the class 3 cluster center is about 20,000, the latest guarantee period of the class 5 cluster center is about -2.91 million days, the number of insurance applications of the class 8 cluster center is about 5,000, and that of the class 9 cluster center is about 750. Specific users of these cluster centers were selected and queried in the business system, and the following problems were found in the outliers. The policy data include credit insurance data; policies of this insurance type are taken out by a credit company in the name of the borrower, are calculated differently from normal policies, and should be excluded from the calculation. For some insurance types, such as accident insurance for vehicle passengers, multiple policies are taken out repeatedly, which is preliminarily presumed to result from the practices of branch companies. For policies underwritten through brokerage channels, the identity information of one person is sometimes used to insure several people, which causes a large number of policies to accumulate on the identity information of one user. After this examination, cluster centers 2, 3, 5, 8, and 9 were confirmed as abnormal data and deleted from the final result, and the cluster centers numbered 0, 1, 4, 6, and 7 were kept.

The above embodiment shows that the clustering model can also detect abnormal data through outliers. When data quality cannot be guaranteed, some outliers can be eliminated through the clustering model. The clusters of the remaining normal user groups are renumbered, finally giving five classes of users, which further improves the accuracy of the clustering result.

In another embodiment, the method further includes semanticizing and visualizing the clustering result in the server. Specifically, each value center characteristic value generated by the clustering algorithm is compared with the mean, median, and quantiles of the corresponding index and converted into a more intuitive quantitative adjective, such as very high, higher, or lower. At the same time, according to the correspondence between value center characteristic values and user value, the user class numbers generated by clustering are converted into user group names with business meaning, such as new general value users, medium value users, high value users, lost users with claims, and lost low value users.

Specifically, the value center characteristic value of each cluster center is compared with the sample mean, median, or quantiles to obtain the level of each cluster within the whole sample. For example, if the mean premium of a cluster center is 20,000 and the mean premium of the whole user sample is 30,000, the cluster center can be assigned a value grade with a lower premium.

In one embodiment, for each transaction index, four thresholds Q1, Q2, Q3, and Q4 may be used to divide the index into grades. For any transaction index value x: if x is less than or equal to Q1, x is graded extremely low; if x is greater than Q1 and less than or equal to Q2, x is graded lower; if x is greater than Q2 and less than or equal to Q3, x is graded medium; if x is greater than Q3 and less than or equal to Q4, x is graded higher; and if x is greater than Q4, x is graded extremely high. It should be noted that, for transaction indexes that have special determination rules in the business, other corresponding criteria may also be added, which is not limited here.
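The grading rule can be sketched with a binary search over the four thresholds. The boundary value x = Q4 is treated as "higher" here, which is an assumption made to keep the grades non-overlapping; the function name and the threshold values are hypothetical.

```python
import bisect

GRADES = ["extremely low", "lower", "medium", "higher", "extremely high"]


def grade(x, thresholds):
    """Grade x against ascending thresholds [Q1, Q2, Q3, Q4].

    bisect_left counts the thresholds strictly below x, so a value equal
    to a threshold falls into the grade below that threshold.
    """
    if len(thresholds) != 4 or list(thresholds) != sorted(thresholds):
        raise ValueError("expected four ascending thresholds Q1..Q4")
    return GRADES[bisect.bisect_left(thresholds, x)]


# Hypothetical thresholds for one transaction index.
thresholds = [10, 20, 30, 40]
labels = [grade(x, thresholds) for x in (10, 15, 30, 40, 41)]
```

Indexes with special business rules would need their own grading function in place of this generic one.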

In one embodiment, Table 2 shows the value center characteristic values of the users, and Table 3 shows the result of semanticizing the values in Table 2.

Table 2: Value center characteristic values of users

Table 3: Semanticized clustering results of users

In this embodiment, the cluster centers of the users are presented semantically, which makes the clustering result more intuitive.

In one embodiment, after the obtaining the user data set, the method further comprises: preprocessing data in the user data set; the preprocessing includes at least one of data verification, data cleansing, and data normalization.

In a real business scenario, user data in the user data set may have been entered irregularly, so clustering cannot be performed directly on the data as entered. To improve the validity of the user data, the data therefore need to be cleaned before the users are clustered. Specifically, user data cleaning rules may be preset and applied to the user data.

In one embodiment, data cleaning may include cleaning erroneous data in the user data. Specifically, because of data entry errors, different representations in different source systems, inconsistencies between data, and so on, the existing data contain dirty data of one kind or another, mainly illegal values, non-standard entries, inconsistent values, and duplicated data. Data cleaning includes removing unnecessary fields, cleaning formats and contents, filling missing values, cleaning logic errors, and verifying data authenticity.

First, the identified customer information data are extracted from Oracle to the Hive platform, the base table and the data to be cleaned are loaded, and information authenticity checks are performed according to specific rules, such as province verification of the ID card number and number-segment verification of the mobile phone number, so that only real and valid data remain. In one embodiment, data cleaning includes:
1. The ID card number must be 15 or 18 digits long, must pass region code verification, ID card date verification, and check digit verification, and must not contain abnormal digit strings such as '0000'. When the ID card number fails any of these requirements, it is set to null.
2. If the user and the field personnel have the same ID card number or mobile phone number, the ID card number or mobile phone number is also set to null.
3. Names are kept only if they consist purely of Chinese characters or purely of letters and spaces; names mixing Chinese and English are removed.
4. If the mobile phone number is not 11 digits long, is identified as unconventional by the given rules, or contains unconventional digit strings such as '000000', it is set to null.
5. Names with a length of 3 characters or more that contain the character 'equal to' are removed.
7. Records in which 3 different clients use the same ID card number and mobile phone number are rejected.
8. Names containing "company" are removed.

In one embodiment, the data cleaning rules include:
a. Cleaning rules (admission rules), including null value verification, ID card number verification, mobile phone number verification, and so on. When a field does not satisfy the configured rule, it is assigned a new value according to the specified default. When a field is empty or Null, it is replaced with the character string 'null'. Finally, each original record yields a corresponding new record under the data cleaning rules; if the row is valid it proceeds to the next step, ID pass-through, and otherwise it is filtered out.
b. Empty value check: determine whether the field value is empty, and if so, assign the character string 'null' as the field default.
c. Null value check: determine whether the field value is Null, and if so, assign the character string 'Null' as the field default.
d. ID card number check: determine whether the ID card number is legal, whether the region code verification passes, whether the ID card date verification passes, whether the last digit (check digit) is correct, and whether the length is correct. Specifically, the ID card number is set to null when its province code is incorrect, when it fails the regular expression check, when its check digit is incorrect, or when it contains '0000'.
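The ID card number check in item d can be sketched as follows. The check digit uses the weights and remainder table of the GB 11643-1999 resident identity number standard. The function name is hypothetical, region verification is simplified to the two-digit province prefix, and 15-digit numbers are assumed to carry a two-digit year in the 1900s and no check digit.

```python
import re
from datetime import datetime
from typing import Optional

# Two-digit province-level prefixes of the administrative division codes.
PROVINCE_CODES = {
    "11", "12", "13", "14", "15", "21", "22", "23", "31", "32", "33", "34",
    "35", "36", "37", "41", "42", "43", "44", "45", "46", "50", "51", "52",
    "53", "54", "61", "62", "63", "64", "65", "71", "81", "82",
}
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CODES = "10X98765432"  # indexed by (weighted sum mod 11)


def clean_id_number(raw: Optional[str]) -> Optional[str]:
    """Return the ID number if it passes every check, otherwise None."""
    if not raw:
        return None
    value = raw.strip().upper()
    if re.fullmatch(r"[0-9]{15}", value):
        birth, has_check_digit = "19" + value[6:12], False
    elif re.fullmatch(r"[0-9]{17}[0-9X]", value):
        birth, has_check_digit = value[6:14], True
    else:
        return None  # wrong length or illegal characters
    if "0000" in value or value[:2] not in PROVINCE_CODES:
        return None
    try:
        datetime.strptime(birth, "%Y%m%d")  # rejects impossible dates
    except ValueError:
        return None
    if has_check_digit:
        total = sum(int(d) * w for d, w in zip(value[:17], WEIGHTS))
        if CHECK_CODES[total % 11] != value[17]:
            return None
    return value


cleaned = clean_id_number("11010519491231002X")
```

Returning None corresponds to setting the field to null in the cleaning rules; a full implementation would verify the complete six-digit region code against the administrative division table.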

Mobile phone number check: determine whether the mobile phone number is legal, for example by checking its length, whether it starts with 1, and whether it is an abnormal number such as 1111111111. Specifically, the mobile phone number is set to null when its length is not equal to 11, when it does not start with 1, or when it contains any of the following strings: '000000', '11111111', '22222222', '33333333', '44444444', '5555555555', '66666666', '77777777', '88888888', '99999999', '23456789', '12345678', '01234567', '34567890', '456789', or '1380013800'.
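The mobile phone number rules translate directly into a small validation function. The blocked strings are copied from the text above; the function name and the additional digits-only check are assumptions added for robustness.

```python
from typing import Optional

# Substrings that mark a number as abnormal, as listed in the text.
BLOCKED_SUBSTRINGS = (
    "000000", "11111111", "22222222", "33333333", "44444444", "5555555555",
    "66666666", "77777777", "88888888", "99999999", "23456789", "12345678",
    "01234567", "34567890", "456789", "1380013800",
)


def clean_mobile_number(raw: Optional[str]) -> Optional[str]:
    """Return the mobile number if it passes every rule, otherwise None."""
    if not raw:
        return None
    value = raw.strip()
    if len(value) != 11 or not (value.isascii() and value.isdigit()):
        return None  # wrong length, or not purely digits (added assumption)
    if not value.startswith("1"):
        return None
    if any(blocked in value for blocked in BLOCKED_SUBSTRINGS):
        return None
    return value


valid_number = clean_mobile_number("13812705634")
```

Keeping the blocked strings in one tuple makes the rule set easy to extend when new abnormal patterns are found.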

Handling of information identical to agent information: when a record in the user basic information summary table is judged not to belong to an agent but uses an agent's information, the corresponding information is set to null.

In one embodiment, data normalization proceeds as follows. Because a client may be reached through multiple portals and multiple paths, the same client may be tagged with multiple IDs in different systems. Likewise, when a user transacts business several times, the user may be treated as two clients because the information provided differs. When analyzing the value of a client, ID normalization is therefore needed to gather the client's data across all systems and all time periods; this is called client ID pass-through. Specifically, the pass-through rule includes: obtaining the user basic information summary table in the server, generating a new user ID for each user as the user's unique identifier, and storing the new user ID in the first field of the summary table. One client ID may correspond to multiple records, but each record belongs to only one client ID.

ID linking: a field that uniquely identifies the user according to specified rules is added to each record, and the primary key of each record in the source table is retained in a separate field so that the source table can be traced back; the processed result data is finally stored in the Hive data warehouse. Specifically, the rules currently used to identify a user are as follows: the first two characters of the name + certificate number determine a user; name + mobile phone number determine a user; certificate number + mobile phone number determine a user; name + bank card number determine a user; name + WeChat ID determine a user; mobile phone number + bank card number determine a user; mobile phone number + WeChat ID determine a user; name + device ID determine a user; mobile phone number + device ID determine a user; and so on. The rules are not limited to those listed here.
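As a minimal sketch of how such pairwise rules might be applied, the following example links records that share the same value pair under any rule and assigns one user ID per linked group. The field names (name, cert_no, mobile, bank_card, wechat_id, device_id), the ID format, and the transitive merging through a union-find structure are all assumptions made for illustration; the description above only specifies which field pairs determine a user.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]

# Field pairs taken from the rules above; the field names are hypothetical.
RULES: List[Tuple[str, str]] = [
    ("name_prefix2", "cert_no"), ("name", "mobile"), ("cert_no", "mobile"),
    ("name", "bank_card"), ("name", "wechat_id"), ("mobile", "bank_card"),
    ("mobile", "wechat_id"), ("name", "device_id"), ("mobile", "device_id"),
]


def _field(rec: Record, field: str) -> Optional[str]:
    """Read a field; 'name_prefix2' is derived from the first two characters of the name."""
    if field == "name_prefix2":
        full = rec.get("name")
        return full[:2] if full else None
    return rec.get(field) or None


def link_ids(records: List[Record]) -> List[str]:
    """Return one user ID per record; records matching under any rule share an ID."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen: Dict[Tuple[str, str, str, str], int] = {}
    for idx, rec in enumerate(records):
        for a, b in RULES:
            va, vb = _field(rec, a), _field(rec, b)
            if va is None or vb is None:
                continue  # a rule applies only when both fields are present
            key = (a, va, b, vb)
            if key in first_seen:
                ra, rb = find(idx), find(first_seen[key])
                parent[max(ra, rb)] = min(ra, rb)  # merge the two groups
            else:
                first_seen[key] = idx

    labels: Dict[int, str] = {}
    user_ids: List[str] = []
    for idx in range(len(records)):
        root = find(idx)
        if root not in labels:
            labels[root] = f"U{len(labels) + 1:06d}"  # hypothetical ID format
        user_ids.append(labels[root])
    return user_ids
```

In practice the returned ID would be written to the first field of each record while the source-table primary key is kept alongside it, as described above.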

A customer service request processing method comprises the following steps: receiving a customer service request, wherein the customer service request carries customer service data; according to the customer service data, calculating by using a user value clustering method provided in any one of the above embodiments to obtain a customer service value category set; acquiring a service strategy corresponding to the customer service value category set; and processing the customer service request according to the service policy.

In a specific application scenario, customer service requests can be classified into value category sets, so that high-quality customer service resources are allocated accordingly. The service policy may be a specific service strategy, an increase in the allocation level, or the like.
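The request processing flow can be sketched as follows, purely as an illustration. It assumes that cluster centers have already been obtained by the user value clustering method and that a request is assigned to the category whose center is nearest; the nearest-center assignment, the request layout, and the policy names are assumptions of this example and are not specified in the description.

```python
import math
from typing import Dict, List, Sequence


def nearest_category(features: Sequence[float], centers: List[Sequence[float]]) -> int:
    """Index of the cluster center closest (Euclidean distance) to the features."""
    return min(range(len(centers)), key=lambda k: math.dist(features, centers[k]))


def handle_request(request: Dict, centers: List[Sequence[float]],
                   policies: Dict[int, str]) -> Dict:
    """Classify the request's customer data and attach the matching service policy."""
    category = nearest_category(request["features"], centers)
    return {
        "customer_id": request["customer_id"],
        "category": category,
        "policy": policies[category],  # service policy for that value category set
    }
```

For example, with two centers and the hypothetical policies {0: "standard queue", 1: "priority queue"}, a request whose features lie closest to the second center is handled under "priority queue".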

Further, the user group can be subdivided according to the table to obtain customer service value category sets of a plurality of levels.

Category 1 user group: new customers of general value. These customers have only recently taken out their first policy, and the policy will expire in the near future. For such customers, better service and more frequent contact can be provided to see whether they can be converted into long-term customers.

Category 2 user group: medium-value customers. These users have purchased products several times and have a certain customer value, but have not purchased a product again for some time. For such customers, follow-up visits can be made as appropriate to determine whether the customers have been lost and whether they can be won back.

Category 3 user group: high-value customers. These users have been insured for many years and have purchased more products; they are core customers. Their policies also have a long time remaining before expiry, so they can be inferred to be holders of long-term policies who are not easily lost. For such customers, personalized high-quality service should be provided to improve the customer experience.

Category 4 user group: lost customers with claims. The number of times these users have taken out insurance is slightly higher than average, but very large claims have occurred. According to the latest policy expiry date, these users have not taken out insurance for more than one year and can be preliminarily judged to be lost. Such customers are high-risk customers for the insurance company, and where there is a possibility of insurance fraud, this should be noted in subsequent underwriting work.

Category 5 user group: lost low-value customers. These users have purchased fewer products, with relatively low premiums. According to the latest policy expiry date, these users have not taken out insurance for about three years and can be judged to be lost.

Meanwhile, fig. 4 is a distribution diagram of the cluster centers of the different levels provided in an embodiment. Specifically, the MinMaxScale (data normalization) method is applied in the server to scale the feature values of the cluster centers proportionally, and a five-dimensional chart is drawn. The chart visually shows the distribution of a single cluster of users over the five RFCMP features and the feature differences between different clusters of users, so that service personnel can readily understand the customer attributes.
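A minimal sketch of the proportional scaling step is given below. It rescales each feature of the cluster centers to the range [0, 1] column by column, which is the usual meaning of min-max scaling; the function name and the handling of a constant column are assumptions of this example.

```python
from typing import List


def min_max_scale(centers: List[List[float]]) -> List[List[float]]:
    """Scale every feature (column) of the cluster centers to [0, 1]."""
    n_features = len(centers[0])
    lows = [min(row[j] for row in centers) for j in range(n_features)]
    highs = [max(row[j] for row in centers) for j in range(n_features)]
    scaled = []
    for row in centers:
        scaled.append([
            # A constant column carries no contrast; map it to 0.0 (assumption).
            0.0 if highs[j] == lows[j] else (row[j] - lows[j]) / (highs[j] - lows[j])
            for j in range(n_features)
        ])
    return scaled
```

After scaling, each row holds the five values of one cluster center on a common scale, so the centers can be drawn on the same five-axis chart and compared directly.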

In one embodiment, an application flow is provided, which specifically includes the following steps:

1) Select a window period, collect the underwriting data and claim settlement data of the clients within the window period from the underwriting system and the claim settlement system respectively, clean the data, and process it into the five RFCMP indexes.

2) Roughly determine the range of k, and select the number k of clusters most suitable for the client data by the elbow method.

3) Train iteratively with the generated client RFCMP indexes until the clustering model converges.

4) Restore the cluster centers of the model to real business data, and eliminate as outliers the clients that do not conform to business logic.

5) Store the model after the outliers are eliminated.

6) Give the results of the clustering algorithm semantic meaning and visualize them, assigning each client a client group name with business meaning. The client group name of each client is stored as a label of the client in the client data analysis application platform.

7) Provide differentiated service or differentiated claim settlement based on the client labels. Each time a client calls in, the customer service agent can query the client's value label through the client data analysis platform. With the label as a reference, different scripts and service strategies are used to provide personalized and targeted services for the user.

It should be understood that, although the steps in the flowchart of fig. 2 are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a portion of the steps in fig. 2 may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 5, there is provided a user value clustering device 500, which may be implemented as part of a computer device by a software module, a hardware module, or a combination of the two, and which specifically includes:

an obtaining module 502, configured to obtain a user data set; the user data set comprises user data corresponding to more than one user;

an index determination module 504, configured to determine a current transaction index, a predicted transaction index, and a potential transaction index according to each user data;

a relevancy determination module 506, configured to determine importance of the current transaction index, the predicted transaction index, and the potential transaction index;

a clustering determination module 508 for determining a clustering index according to the importance;

and a clustering module 510, configured to cluster the users according to the clustering index to obtain a plurality of user value category sets.

In one embodiment, the index determination module 504 is further configured to determine the existing value, the development value, and the potential value of the user according to each user data; and to extract the current transaction index from the existing value, the predicted transaction index from the development value, and the potential transaction index from the potential value.

In one embodiment, the index determination module 504 is further configured to, in a computing thread, determine the existing value of the user according to the transaction information in each user data; determine the potential value of the user according to the asset information in each user data; and construct a prediction model according to the user data so as to determine the development value of the user according to the prediction model. The clustering module 510 is further configured to, in a value determination thread, cluster the users according to the existing values, development values, and potential values of the users to obtain a plurality of user value category sets; wherein the computing thread works in parallel with the value determination thread.

In one embodiment, the index determination module 504 is further configured to extract a plurality of current transaction indexes from the transaction information; acquire a plurality of historical time series factors corresponding to each current transaction index, each historical time series factor corresponding to a different transaction time; acquire the factor weight corresponding to each historical time series factor; construct an index prediction model corresponding to each current transaction index according to the historical time series factors and the factor weights, and determine a predicted transaction index corresponding to each current transaction index according to the index prediction model; and determine the development value of the user according to the predicted transaction indexes corresponding to the current transaction indexes.

In one embodiment, the user value clustering device further includes a weight determining module 512, where the weight determining module 512 is configured to, for each current transaction index, acquire a plurality of fluctuation values corresponding to the current transaction index, each fluctuation value corresponding to a different transaction time, and acquire the fluctuation weight corresponding to each fluctuation value. The index determination module 504 is further configured to construct an index prediction model corresponding to each current transaction index according to each historical time series factor, each factor weight, each fluctuation value, and each fluctuation weight, so as to determine a predicted transaction index corresponding to each current transaction index according to the index prediction model.
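Purely as an illustrative sketch, one possible form of such an index prediction model is a weighted sum of the historical time series factors plus a weighted sum of the fluctuation values. This linear form is an assumption made for the example; the passage above states only which inputs the model is constructed from, and the function name predict_index is hypothetical.

```python
from typing import Sequence


def predict_index(factors: Sequence[float], factor_weights: Sequence[float],
                  fluctuations: Sequence[float],
                  fluctuation_weights: Sequence[float]) -> float:
    """Assumed linear model: sum(weight * factor) + sum(weight * fluctuation)."""
    if len(factors) != len(factor_weights):
        raise ValueError("each historical time series factor needs one factor weight")
    if len(fluctuations) != len(fluctuation_weights):
        raise ValueError("each fluctuation value needs one fluctuation weight")
    trend = sum(w * x for w, x in zip(factor_weights, factors))
    correction = sum(v * e for v, e in zip(fluctuation_weights, fluctuations))
    return trend + correction
```

Under this assumed form, one such model would be built per current transaction index, and the resulting predicted transaction indexes would then feed into the development value of the user.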

In one embodiment, the clustering module 510 is further configured to preset a cluster quantity interval, where the cluster quantity interval includes more than one cluster quantity; read the cluster quantities from the cluster quantity interval sequentially in order of size, and cluster the users based on each cluster quantity to obtain the cluster distortion value corresponding to that cluster quantity; when the difference between the cluster distortion value of the latest cluster quantity and the cluster distortion value of the previous cluster quantity is smaller than a preset value, take the latest cluster quantity as the optimal cluster quantity; and cluster the users according to the optimal cluster quantity to obtain a plurality of user value category sets.

In an embodiment, the clustering module 510 is further configured to, for each cluster quantity, obtain a plurality of user value category sets obtained by clustering users according to the cluster quantity; for each cluster quantity, acquiring a value central point and a non-central point of each user value category set; and for each cluster quantity, determining the distortion value of each user value category set according to the distance value between each value central point and the non-central point, and obtaining the cluster distortion value of the cluster quantity according to the distortion value of each user value category set.
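The selection of the cluster quantity described in the two preceding paragraphs can be sketched as follows. The example assumes a k-means style clustering, squared Euclidean distance as the distance value, and a deterministic farthest-point initialization; none of these choices is fixed by the description, and the function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Sequence[float]


def _sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _assign(points: List[Point], centers: List[List[float]]) -> List[int]:
    return [min(range(len(centers)), key=lambda j: _sq_dist(p, centers[j]))
            for p in points]


def kmeans(points: List[Point], k: int, iterations: int = 50) -> Tuple[List[List[float]], List[int]]:
    # Farthest-point initialization keeps the sketch deterministic (assumption).
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(points, key=lambda p: min(_sq_dist(p, c) for c in centers))))
    for _ in range(iterations):
        labels = _assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, _assign(points, centers)


def cluster_distortion(points: List[Point], centers: List[List[float]], labels: List[int]) -> float:
    """Sum, over all category sets, of the distances from each point to its center."""
    return sum(_sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


def choose_cluster_count(points: List[Point], k_values: Iterable[int], threshold: float) -> int:
    """Read k in ascending order; return the latest k once the distortion drop < threshold."""
    ordered = sorted(k_values)
    previous = None
    for k in ordered:
        centers, labels = kmeans(points, k)
        distortion = cluster_distortion(points, centers, labels)
        if previous is not None and previous - distortion < threshold:
            return k
        previous = distortion
    return ordered[-1]  # fallback when the drop never falls below the threshold (assumption)
```

Following the wording above, the function returns the latest cluster quantity, that is, the first one whose distortion improves on the previous cluster quantity by less than the preset value.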

In one embodiment, the user value clustering device further comprises a conversion module, wherein the conversion module is used for acquiring the value center characteristic value corresponding to each user value category set; converting each value center characteristic value into real service data; checking each user value category set according to the real service data; and removing the user value category sets which fail to pass the verification, and obtaining various user value category sets according to the remaining data after removal.
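A minimal sketch of the conversion and verification performed by the conversion module is given below. It assumes that the features were min-max scaled before clustering, so that a center is restored by the inverse of that scaling, and that the business check is supplied as a caller-defined predicate; both points are assumptions of this example and are not stated in the description.

```python
from typing import Callable, Dict, List, Sequence


def inverse_min_max(scaled_center: Sequence[float], lows: Sequence[float],
                    highs: Sequence[float]) -> List[float]:
    """Restore a scaled center to real business units (inverse of min-max scaling)."""
    return [lo + s * (hi - lo) for s, lo, hi in zip(scaled_center, lows, highs)]


def filter_categories(scaled_centers: Dict[int, Sequence[float]],
                      lows: Sequence[float], highs: Sequence[float],
                      is_plausible: Callable[[List[float]], bool]) -> Dict[int, List[float]]:
    """Keep only the category sets whose restored center passes the business check."""
    kept: Dict[int, List[float]] = {}
    for category_id, center in scaled_centers.items():
        real = inverse_min_max(center, lows, highs)
        if is_plausible(real):  # hypothetical business-logic verification
            kept[category_id] = real
    return kept
```

The predicate stands in for whatever business rule is used for verification; category sets that fail it are removed, and the remaining sets form the final result.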

In one embodiment, the user value clustering device further comprises a preprocessing module, wherein the preprocessing module is used for preprocessing data in the user data set; the preprocessing includes at least one of data verification, data cleansing, and data normalization.

In one embodiment, as shown in fig. 6, there is provided a customer service request processing apparatus 600, which may be implemented as part of a computer device by a software module, a hardware module, or a combination of the two, and which specifically includes:

a receiving module 602, configured to receive a customer service request, where the customer service request carries customer service data;

a calculating module 604, configured to calculate, according to the customer service data, a customer service value category set by using the user value clustering method;

a policy obtaining module 606, configured to obtain a service policy corresponding to the customer service value category set;

and a processing module 608, configured to process the customer service request according to the service policy.

For the specific definition of the user value clustering device, reference may be made to the above definition of the user value clustering method, which is not repeated here. All or part of each module in the user value clustering device can be implemented by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing user data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a user value clustering method.

In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 7. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a user value clustering method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.

Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the following steps when the processor executes the computer program: acquiring a user data set; the user data set comprises user data corresponding to more than one user; determining a current transaction index, a predicted transaction index and a potential transaction index according to each user data; determining the importance of the current transaction index, the predicted transaction index, and the potential transaction index; determining a clustering index according to the importance; and clustering the users according to the clustering indexes to obtain various user value classification sets.

In one embodiment, the processor, when executing the computer program, further performs the steps of: determining the existing value, the development value and the potential value of the user according to the user data; extracting a current transaction index from the existing value, extracting a predicted transaction index from the development value, and extracting a potential transaction index from the potential value.

In one embodiment, the processor, when executing the computer program, further performs the steps of: in a computing thread, determining the existing value of a user according to transaction information in each user data; determining the potential value of the user according to the asset information in each user data; and constructing a prediction model according to the user data so as to determine the development value of the user according to the prediction model.

In one embodiment, the processor, when executing the computer program, further performs the steps of: in a value determining thread, clustering the users according to the existing values, the development values and the potential values of the users to obtain a plurality of user value classification sets; wherein the computing thread works in parallel with the value determining thread.

In one embodiment, the processor, when executing the computer program, further performs the steps of: extracting a plurality of current transaction indexes from the transaction information; acquiring a plurality of historical time series factors corresponding to each current transaction index, each historical time series factor corresponding to a different transaction time; acquiring the factor weight corresponding to each historical time series factor; constructing an index prediction model corresponding to each current transaction index according to the historical time series factors and the factor weights, and determining a predicted transaction index corresponding to each current transaction index according to the index prediction model; and determining the development value of the user according to the predicted transaction indexes corresponding to the current transaction indexes.

In one embodiment, the processor, when executing the computer program, further performs the steps of: for each current transaction index, acquiring a plurality of fluctuation values corresponding to the current transaction index; each fluctuation value corresponds to different transaction time; and acquiring fluctuation weights corresponding to the fluctuation values.

In one embodiment, the processor, when executing the computer program, further performs the steps of: constructing an index prediction model corresponding to each current transaction index according to each historical time series factor, each factor weight, each fluctuation value and each fluctuation weight, so as to determine a predicted transaction index corresponding to each current transaction index according to the index prediction model.

In one embodiment, the processor, when executing the computer program, further performs the steps of: presetting a cluster quantity interval, wherein the cluster quantity interval comprises more than one cluster quantity; reading the cluster quantities from the cluster quantity interval sequentially in order of size, and clustering the users based on each cluster quantity to obtain the cluster distortion value corresponding to that cluster quantity; when the difference between the cluster distortion value of the latest cluster quantity and the cluster distortion value of the previous cluster quantity is smaller than a preset value, taking the latest cluster quantity as the optimal cluster quantity; and clustering the users according to the optimal cluster quantity to obtain a plurality of user value category sets.

In one embodiment, the processor, when executing the computer program, further performs the steps of: for each clustering quantity, acquiring a plurality of user value classification sets obtained by clustering users according to the clustering quantity; for each cluster quantity, obtaining a value central point and a non-central point of each user value category set; and for each clustering quantity, determining the distortion value of each user value category set according to the distance value between each value central point and the non-central point, and obtaining the clustering distortion value of the clustering quantity according to the distortion value of each user value category set.

In one embodiment, the processor, when executing the computer program, further performs the steps of: obtaining a value center characteristic value corresponding to each user value category set; converting each value center characteristic value into real service data; checking each user value category set according to the real service data; and removing the user value category sets which fail to pass the verification, and obtaining various user value category sets according to the remaining data after removal.

In one embodiment, the processor, when executing the computer program, further performs the steps of: preprocessing data in the user data set; the preprocessing comprises at least one of data verification, data cleaning and data normalization.

In one embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the following steps when the processor executes the computer program: receiving a customer service request, wherein the customer service request carries customer service data; calculating to obtain a customer service value classification set by the user value clustering method according to the customer service data; acquiring a service strategy corresponding to the customer service value category set; and processing the customer service request according to the service strategy.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor performs the steps of: acquiring a user data set; the user data set comprises user data corresponding to more than one user; determining a current transaction index, a predicted transaction index and a potential transaction index according to each user data; determining the importance of the current transaction index, the predicted transaction index, and the potential transaction index; determining a clustering index according to the importance; and clustering the users according to the clustering indexes to obtain various user value classification sets.

In one embodiment, the computer program when executed by the processor further performs the steps of: determining the existing value, the development value and the potential value of the user according to the user data; extracting a current transaction index from the existing value, extracting a predicted transaction index from the development value, and extracting a potential transaction index from the potential value.

In one embodiment, the computer program when executed by the processor further performs the steps of: in a computing thread, determining the existing value of a user according to transaction information in each user data; determining the potential value of the user according to the asset information in each user data; and constructing a prediction model according to the user data so as to determine the development value of the user according to the prediction model.

In one embodiment, the computer program when executed by the processor further performs the steps of: in a value determining thread, clustering the users according to the existing values, the development values and the potential values of the users to obtain a plurality of user value classification sets; wherein the computing thread works in parallel with the value determining thread.

In one embodiment, the computer program when executed by the processor further performs the steps of: extracting a plurality of current transaction indexes from the transaction information; acquiring a plurality of historical time series factors corresponding to each current transaction index, each historical time series factor corresponding to a different transaction time; acquiring the factor weight corresponding to each historical time series factor; constructing an index prediction model corresponding to each current transaction index according to the historical time series factors and the factor weights, and determining a predicted transaction index corresponding to each current transaction index according to the index prediction model; and determining the development value of the user according to the predicted transaction indexes corresponding to the current transaction indexes.

In one embodiment, the computer program when executed by the processor further performs the steps of: for each current transaction index, acquiring a plurality of fluctuation values corresponding to the current transaction index; each fluctuation value corresponds to different transaction time; and acquiring fluctuation weights corresponding to the fluctuation values.

In one embodiment, the computer program when executed by the processor further performs the steps of: constructing an index prediction model corresponding to each current transaction index according to each historical time series factor, each factor weight, each fluctuation value and each fluctuation weight, so as to determine a predicted transaction index corresponding to each current transaction index according to the index prediction model.

In one embodiment, the computer program when executed by the processor further performs the steps of: presetting a cluster quantity interval, wherein the cluster quantity interval comprises more than one cluster quantity; reading the cluster quantities from the cluster quantity interval sequentially in order of size, and clustering the users based on each cluster quantity to obtain the cluster distortion value corresponding to that cluster quantity; when the difference between the cluster distortion value of the latest cluster quantity and the cluster distortion value of the previous cluster quantity is smaller than a preset value, taking the latest cluster quantity as the optimal cluster quantity; and clustering the users according to the optimal cluster quantity to obtain a plurality of user value category sets.

In one embodiment, the computer program when executed by the processor further performs the steps of: for each clustering quantity, acquiring a plurality of user value classification sets obtained by clustering users according to the clustering quantity; for each cluster quantity, obtaining a value central point and a non-central point of each user value category set; and for each clustering quantity, determining the distortion value of each user value category set according to the distance value between each value central point and the non-central point, and obtaining the clustering distortion value of the clustering quantity according to the distortion value of each user value category set.

In one embodiment, the computer program when executed by the processor further performs the steps of: obtaining a value center characteristic value corresponding to each user value category set; converting each value center characteristic value into real service data; checking each user value category set according to the real service data; and removing the user value category sets which fail to pass the verification, and obtaining various user value category sets according to the remaining data after removal.

In one embodiment, the computer program when executed by the processor further performs the steps of: preprocessing data in the user data set; the preprocessing comprises at least one of data verification, data cleaning and data normalization.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor performs the steps of: receiving a customer service request, wherein the customer service request carries customer service data; calculating to obtain a customer service value classification set by the user value clustering method according to the customer service data; acquiring a service strategy corresponding to the customer service value category set; and processing the customer service request according to the service strategy.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.

The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above examples express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A method for clustering user value, the method comprising:

determining a current transaction index, a predicted transaction index and a potential transaction index according to the user data of each user; wherein the current transaction index is generated by: determining an existing value according to transaction information of the user data, and extracting the current transaction index from the existing value; the potential transaction index is generated by: determining a potential value of a user according to asset information of the user data, and extracting the potential transaction index from the potential value; and the predicted transaction index is extracted from a development value, the development value being determined by:

extracting a plurality of current transaction indexes from the transaction information;

acquiring a plurality of historical time series factors corresponding to the current transaction index, each historical time series factor corresponding to a different transaction time;

acquiring a factor weight corresponding to each historical time series factor;

for each current transaction index, acquiring a plurality of fluctuation values corresponding to the current transaction index, each fluctuation value corresponding to a different transaction time;

acquiring a fluctuation weight corresponding to each fluctuation value;

building an index prediction model corresponding to each current transaction index according to the historical time series factors, the factor weights, the fluctuation values and the fluctuation weights, and determining a predicted transaction index corresponding to each current transaction index according to the index prediction model; and

determining the development value of the user according to the predicted transaction indexes corresponding to the current transaction indexes;

determining the importance of the current transaction index, the predicted transaction index, and the potential transaction index;

determining a clustering index according to the importance;
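By way of illustration only, the index prediction model recited in claim 1 might combine its four inputs as a weighted sum. The claim does not fix the functional form, so the additive form below, and the function name `predict_index`, are assumptions made for this sketch.

```python
from typing import Sequence


def predict_index(
    factors: Sequence[float],
    factor_weights: Sequence[float],
    fluctuations: Sequence[float],
    fluctuation_weights: Sequence[float],
) -> float:
    """Predict one transaction index from its history.

    Assumed form (not stated in the claim): weighted sum of the historical
    time series factors plus weighted sum of the fluctuation values.
    """
    if len(factors) != len(factor_weights):
        raise ValueError("each historical factor needs exactly one weight")
    if len(fluctuations) != len(fluctuation_weights):
        raise ValueError("each fluctuation value needs exactly one weight")
    # Each factor and each fluctuation value belongs to a different
    # transaction time; the weights say how much that time counts.
    trend = sum(w * x for w, x in zip(factor_weights, factors))
    correction = sum(w * e for w, e in zip(fluctuation_weights, fluctuations))
    return trend + correction


# Two past periods of one index, with the more recent period weighted higher.
predicted = predict_index([100.0, 120.0], [0.25, 0.75], [4.0, -2.0], [0.5, 0.5])
```

Here the weighted history contributes 115.0 and the weighted fluctuations contribute 1.0, so `predicted` is 116.0.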

2. The method of claim 1, wherein determining the existing value from transaction information of the user data comprises:

in a computing thread, determining the existing value of a user according to transaction information in each user data;

and wherein clustering the users according to the clustering index to obtain a plurality of user value classification sets comprises:

in a value determining thread, clustering the users according to the existing values, the development values and the potential values of the users to obtain a plurality of user value classification sets; wherein the computing thread works in parallel with the value determining thread.
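The parallel arrangement of claim 2 can be pictured as two threads connected by a queue. The sketch below is an assumption-laden illustration: the existing value is taken to be the total transaction amount, and a simple threshold split stands in for the clustering step, neither of which is specified by the claim. All function names are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel that tells the consumer no more values will arrive


def computing_thread(user_data: dict, out_q: queue.Queue) -> None:
    """Compute each user's existing value (assumed: total transaction amount)."""
    for user_id, transactions in user_data.items():
        out_q.put((user_id, sum(transactions)))
    out_q.put(_DONE)


def value_determining_thread(in_q: queue.Queue, threshold: float, result: dict) -> None:
    """Group users as values arrive (threshold split stands in for clustering)."""
    while True:
        item = in_q.get()
        if item is _DONE:
            break
        user_id, value = item
        result["high" if value >= threshold else "low"].append(user_id)


def run_pipeline(user_data: dict, threshold: float) -> dict:
    q: queue.Queue = queue.Queue()
    result = {"high": [], "low": []}
    producer = threading.Thread(target=computing_thread, args=(user_data, q))
    consumer = threading.Thread(target=value_determining_thread, args=(q, threshold, result))
    producer.start()
    consumer.start()  # both threads now run concurrently
    producer.join()
    consumer.join()
    return result


groups = run_pipeline({"u1": [100, 50], "u2": [10]}, threshold=100)
```

The queue lets the value determining thread start work on early users while the computing thread is still processing later ones.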

3. The method of claim 1, wherein clustering the users according to the clustering index to obtain a plurality of user value category sets comprises:

presetting a clustering quantity interval, wherein the clustering quantity interval comprises more than one clustering quantity;

sequentially reading the cluster quantities from the cluster quantity interval in order of size, clustering the users based on each cluster quantity to obtain a cluster distortion value corresponding to that cluster quantity, and, when the difference between the cluster distortion value of the latest cluster quantity and the cluster distortion value of the previous cluster quantity is smaller than a preset value, taking the latest cluster quantity as the optimal cluster quantity;

and clustering the users according to the optimal clustering quantity to obtain various user value classification sets.
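The stopping rule of claim 3 can be sketched as a short loop. The callable `distortion_for`, which is assumed to cluster the users with a given cluster quantity and return the resulting cluster distortion value, and the name `pick_cluster_count` are hypothetical; the sample distortion figures are invented for illustration.

```python
from typing import Callable, Iterable, Optional


def pick_cluster_count(
    candidates: Iterable[int],
    distortion_for: Callable[[int], float],
    preset_value: float,
) -> Optional[int]:
    """Return the first cluster quantity whose distortion differs from the
    previous quantity's distortion by less than preset_value."""
    previous = None
    for k in sorted(candidates):  # read quantities in order of size
        current = distortion_for(k)
        if previous is not None and abs(previous - current) < preset_value:
            return k  # the latest quantity becomes the optimal quantity
        previous = current
    return None  # no quantity in the interval met the stopping rule


# Invented distortion values for cluster quantities 2 through 6.
sample = {2: 100.0, 3: 60.0, 4: 40.0, 5: 38.0, 6: 37.5}
best_k = pick_cluster_count(range(2, 7), sample.__getitem__, preset_value=5.0)
```

With these figures the drop from 4 to 5 clusters is 2.0, the first drop below 5.0, so `best_k` is 5.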

4. The method of claim 3, wherein said obtaining a cluster distortion value for each of said cluster quantities comprises:

for each clustering quantity, acquiring a plurality of user value classification sets obtained by clustering users according to the clustering quantity;

for each cluster quantity, obtaining a value central point and a non-central point of each user value category set;

and for each clustering quantity, determining the distortion value of each user value category set according to the distance value between each value central point and the non-central point, and obtaining the clustering distortion value of the clustering quantity according to the distortion value of each user value category set.
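A minimal sketch of the distortion calculation of claim 4 follows. The claim only says the distortion is determined from the distances between the value central point and the non-central points; summing squared Euclidean distances, as is common for k-means, is an assumption of this sketch, as are the function names.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def set_distortion(center: Point, non_center_points: List[Point]) -> float:
    """Distortion of one user value category set.

    Assumed measure: sum of squared Euclidean distances to the center.
    """
    return sum(math.dist(center, p) ** 2 for p in non_center_points)


def cluster_distortion(category_sets: List[Tuple[Point, List[Point]]]) -> float:
    """Cluster distortion value for one cluster quantity: the sum of the
    distortion values of all category sets produced with that quantity."""
    return sum(set_distortion(center, pts) for center, pts in category_sets)


# Two category sets, each given as (value central point, non-central points).
sets = [
    ((0.0, 0.0), [(3.0, 4.0), (0.0, 1.0)]),
    ((10.0, 10.0), [(10.0, 12.0)]),
]
total = cluster_distortion(sets)
```

For this toy input the first set contributes 25 + 1 and the second contributes 4, giving a total of 30.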

5. The method of claim 1, further comprising:

obtaining a value center characteristic value corresponding to each user value category set;

converting each value center characteristic value into real service data;

checking each user value category set according to the real service data;

and removing the user value category sets which fail to pass the verification, and obtaining various user value category sets according to the remaining data after removal.
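The verification of claim 5 might look like the sketch below. It assumes that the value center characteristic values were produced by min-max normalization, so that converting them back to real service data is the inverse of that scaling, and that the check is a plausibility predicate supplied by the business; both are assumptions, and the names are hypothetical.

```python
from typing import Callable, Dict


def denormalize(value: float, lo: float, hi: float) -> float:
    """Invert min-max normalization to recover a real service amount."""
    return lo + value * (hi - lo)


def filter_valid_sets(
    centers: Dict[str, float],
    lo: float,
    hi: float,
    is_plausible: Callable[[float], bool],
) -> Dict[str, float]:
    """Keep only the category sets whose center passes verification."""
    kept = {}
    for name, center in centers.items():
        real_value = denormalize(center, lo, hi)  # convert to real service data
        if is_plausible(real_value):              # check the category set
            kept[name] = real_value               # sets that fail are removed
    return kept


# Set "B" has a center outside the normalized range and fails the check.
valid = filter_valid_sets(
    {"A": 0.5, "B": 1.5, "C": 0.0},
    lo=0.0,
    hi=1000.0,
    is_plausible=lambda amount: 0.0 <= amount <= 1000.0,
)
```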

6. The method of claim 1, further comprising, after acquiring the user data set:

preprocessing data in the user data set; the preprocessing comprises at least one of data verification, data cleaning and data normalization.
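As an illustration of the three preprocessing options named in claim 6, the sketch below verifies that a hypothetical `amount` field is numeric, cleans out negative amounts, and applies min-max normalization. The record layout and the specific rules are assumptions chosen for the example.

```python
from typing import List


def preprocess(records: List[dict]) -> List[dict]:
    """Verify, clean and normalize the 'amount' field of user records."""
    # Data verification: the amount must be a number.
    # Data cleaning: negative amounts are treated as invalid and dropped.
    clean = [
        r for r in records
        if isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
    ]
    if not clean:
        return []
    lo = min(r["amount"] for r in clean)
    hi = max(r["amount"] for r in clean)
    span = hi - lo
    # Data normalization: min-max scale to [0, 1]; a constant column maps to 0.
    return [
        {**r, "amount": (r["amount"] - lo) / span if span else 0.0}
        for r in clean
    ]


rows = preprocess([
    {"user": "u1", "amount": 100},
    {"user": "u2", "amount": None},
    {"user": "u3", "amount": 300},
    {"user": "u4", "amount": -5},
    {"user": "u5", "amount": 200},
])
```

Normalizing before clustering keeps indexes measured on large scales from dominating the distance calculation.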

7. A customer service request processing method, comprising:

receiving a customer service request, wherein the customer service request carries customer service data;

calculating a customer service value category set according to the customer service data by the method of any one of claims 1 to 6;

acquiring a service strategy corresponding to the customer service value category set;

and processing the customer service request according to the service strategy.

8. An apparatus for clustering user values, the apparatus comprising:

an index determining module for determining a current transaction index, a predicted transaction index and a potential transaction index according to the user data; wherein the current transaction index is generated by: determining an existing value according to transaction information of the user data, and extracting the current transaction index from the existing value; the potential transaction index is generated by: determining a potential value of a user according to asset information of the user data, and extracting the potential transaction index from the potential value; and the predicted transaction index is extracted from a development value, the development value being determined by:

extracting a plurality of current transaction indexes from the transaction information;

acquiring a plurality of historical time series factors corresponding to the current transaction index, each historical time series factor corresponding to a different transaction time;

acquiring a factor weight corresponding to each historical time series factor;

for each current transaction index, acquiring a plurality of fluctuation values corresponding to the current transaction index, each fluctuation value corresponding to a different transaction time;

acquiring a fluctuation weight corresponding to each fluctuation value;

building an index prediction model corresponding to each current transaction index according to the historical time series factors, the factor weights, the fluctuation values and the fluctuation weights, and determining a predicted transaction index corresponding to each current transaction index according to the index prediction model; and

determining the development value of the user according to the predicted transaction indexes corresponding to the current transaction indexes;

a relevancy determination module for determining the importance of the current transaction index, the predicted transaction index and the potential transaction index;

a clustering determination module for determining a clustering index according to the importance;

9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.