CN117349793A

CN117349793A - User grading method and device, electronic equipment and computer readable storage medium

Info

Publication number: CN117349793A
Application number: CN202311413078.1A
Authority: CN
Inventors: 李可新; 温伟; 彭连旺
Original assignee: Beijing Zhongguancun Kejin Technology Co Ltd
Current assignee: Beijing Zhongguancun Kejin Technology Co Ltd
Priority date: 2023-10-27
Filing date: 2023-10-27
Publication date: 2024-01-05

Abstract

The present disclosure provides a user grading method and apparatus, an electronic device, and a computer readable storage medium. The method comprises: acquiring a plurality of pieces of user information and a plurality of pieces of transaction behavior data corresponding to the plurality of pieces of user information; classifying the transaction behavior data using a plurality of different preset classification numbers to obtain a classification result corresponding to each preset classification number, wherein each preset classification number represents the number of categories contained in the corresponding classification result; calculating a contour coefficient according to the classification result corresponding to each preset classification number, and selecting a target classification number from the plurality of different preset classification numbers based on the contour coefficient; and calculating a user characteristic value corresponding to each category from the transaction behavior data in each category under the target classification number, and determining the user grade corresponding to each category based on the user characteristic value corresponding to each category. Embodiments of the disclosure can accurately evaluate the importance of users and improve the accuracy of user grading.

Description

User grading method and device, electronic equipment and computer readable storage medium

Technical Field

The disclosure relates to the technical field of data analysis, and in particular relates to a user grading method and device, electronic equipment and a computer readable storage medium.

Background

Customer relationship management (CRM) is a business management strategy that classifies customers, organizes enterprise resources around them, and conducts business activities and implements business processes according to customer-centric principles, thereby improving the profitability, profits, and customer satisfaction of the enterprise. In recent years, with the development of the trust industry, the number of users in the trust industry has grown rapidly, and managing those users with CRM has become a necessity.

Currently, two customer grading methods are commonly used in customer relationship management: one grades customers by calculating a weight for each user, and the other grades customers by dividing each index into at least two ranges. However, the indices of users in the trust industry are complex; the number of grades obtained for trust users with these methods is unreasonable and of little practical use, so improvement is needed.

Disclosure of Invention

The disclosure provides a user grading method and device, electronic equipment and a computer readable storage medium.

In a first aspect, the present disclosure provides a user grading method, comprising:

acquiring a plurality of user information and a plurality of transaction behavior data corresponding to the user information, wherein the transaction behavior data corresponding to each user information is generated according to behavior feature data of a plurality of dimensions corresponding to the user information;

classifying the transaction behavior data through a plurality of different preset classification numbers to obtain classification results corresponding to each preset classification number, wherein the preset classification numbers are used for representing the number of categories contained in the classification results;

calculating a contour coefficient according to a classification result corresponding to each preset classification number, and selecting a target classification number from a plurality of different preset classification numbers based on the contour coefficient;

and calculating a user characteristic value corresponding to each category through the transaction behavior data in each category under the target classification number, and determining the user classification corresponding to each category based on the user characteristic value corresponding to each category.

In a second aspect, the present disclosure provides a user grading apparatus, comprising:

an acquisition module, configured to acquire a plurality of user information and a plurality of transaction behavior data corresponding to the user information, wherein the transaction behavior data corresponding to each user information is generated according to behavior feature data of a plurality of dimensions corresponding to the user information;

a classification module, configured to classify the transaction behavior data through a plurality of different preset classification numbers to obtain classification results corresponding to each preset classification number, wherein the preset classification numbers are used for representing the number of categories contained in the classification results;

a selection module, configured to calculate a contour coefficient according to the classification result corresponding to each preset classification number and to select a target classification number from the plurality of different preset classification numbers based on the contour coefficient;

and a determining module, configured to calculate the user characteristic value corresponding to each category through the transaction behavior data in each category under the target classification number, and to determine the user grade corresponding to each category based on the user characteristic value corresponding to each category.

In a third aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores one or more computer programs executable by the at least one processor, the one or more computer programs being executed by the at least one processor to enable the at least one processor to perform the user grading method described above.

In a fourth aspect, the present disclosure provides a computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor/processing core, implements the user grading method described above.

According to the user grading method and device, transaction behavior data corresponding to the user information are obtained; the transaction behavior data are classified using a plurality of preset classification numbers; a contour coefficient is calculated from the classification result corresponding to each preset classification number; and the target classification number, which is the optimal classification number, is selected according to the contour coefficient. Finally, the user characteristic value of each category is calculated from the transaction behavior data included in that category under the target classification number, and user grading is achieved using the user characteristic values. The resulting user grades can accurately evaluate the importance of users and improve the accuracy of user grading, so that users in the financial field and the trust field can be provided with accurate services and products that match their positioning.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:

FIG. 1 is an application scenario diagram of a user grading method and apparatus provided by an embodiment of the present disclosure;

FIG. 2 is a flowchart of a user grading method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of a classification processing flow provided by an embodiment of the present disclosure;

FIG. 4 is a graph showing the correspondence between preset classification numbers and contour coefficients provided by an embodiment of the present disclosure;

FIG. 5 is a flowchart of missing-value processing for transaction behavior data provided by an embodiment of the present disclosure;

FIG. 6 is a flowchart of outlier processing for transaction behavior data provided by an embodiment of the present disclosure;

FIG. 7 is a flowchart of normalization processing for transaction behavior data provided by an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of user grading in a CRM system for the trust industry provided by an embodiment of the present disclosure;

FIG. 9 is a block diagram of a user grading device provided by an embodiment of the present disclosure;

FIG. 10 is a block diagram of an electronic device provided by an embodiment of the present disclosure.

Detailed Description

For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.

As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The term "connected" and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

In the technical solution of the disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information comply with the relevant laws and regulations and do not violate public order and good morals.

In the related art, an RFM model is generally adopted to solve the user grading problem. The three indices of the RFM model are the time of the user's most recent consumption (Recency), the consumption frequency (Frequency), and the consumption amount (Monetary). The RFM model mainly focuses on user activity, consumption capability, and loyalty, screens users according to specific requirements, and judges user value. There are two conventional ways of grading with RFM. In the first, a weight is calculated for each user to obtain a comprehensive user score, and users are divided according to the size of that score. In the second, each RFM index is split by a threshold into a high range and a low range, marked "1" and "0" respectively; combining the three indices yields eight user categories, giving an RFM model based on feature classification. The resulting subdivision rule is as follows: "111": important value users; "101": important developing users; "011": important reserved users; "110": ordinary value users; "001": important withholding users; "100": general developing users; "010": general users; "000": general withholding users.

It can be seen that although the RFM grading method is simple and easy to perform, the number of customer grades grows exponentially as the number of indices increases. Assuming that the number of user-selectable features is n, the number of customer grades finally obtained by the above RFM grading method is 2ⁿ. The practical operability of such grading is clearly much reduced in the big data era. If one new feature variable is added to the original RFM model, 16 different customer groups are produced by this grading mode, and such a grading result obviously cannot be applied in practice when managing customer relationships.
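By way of illustration only, the threshold-based RFM grading described above and its exponential growth can be sketched as follows. The function names, the sample index values, and the thresholds are hypothetical and are not taken from the disclosure; an index is assumed to be marked "1" when it is at or above its threshold.

```python
from itertools import product


def rfm_code(values, thresholds):
    """Mark each index '1' if it reaches its threshold, otherwise '0'."""
    return "".join("1" if v >= t else "0" for v, t in zip(values, thresholds))


def count_grades(n_features):
    """Number of distinct codes that n two-range indices can produce."""
    return len(list(product("01", repeat=n_features)))


# Hypothetical user: high recency score, low frequency, high amount.
code = rfm_code(values=(0.9, 0.2, 0.8), thresholds=(0.5, 0.5, 0.5))
print(code)             # 101
print(count_grades(3))  # 8
print(count_grades(4))  # 16
```

With three indices the sketch yields the eight codes listed above, and a fourth index doubles the count to sixteen.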

Fig. 1 schematically illustrates an application scenario diagram of a user grading method and apparatus provided by an embodiment of the present disclosure.

As shown in fig. 1, an application scenario of an embodiment of the present disclosure may include a terminal device 101, a network 103, and a server 102. The network 103 is a medium used to provide a communication link between the terminal device 101 and the server 102. The network 103 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

A user may interact with the server 102 via the network 103 using the terminal device 101 to receive or send messages or the like. Various communication client applications, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc., may be installed on the terminal device 101 (by way of example only).

The terminal device 101 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.

The server 102 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by the user using the terminal device 101. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.

It should be noted that the user grading method provided in the embodiments of the present disclosure may be performed by the server 102, and accordingly the user grading apparatus may be provided in the server 102. The user grading method may also be performed by a server or server cluster that is different from the server 102 and capable of communicating with the terminal device 101 and/or the server 102, and accordingly the user grading apparatus may also be provided in such a server or server cluster.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Fig. 2 is a flowchart of a user grading method according to an embodiment of the present disclosure. Referring to fig. 2, the method includes:

step S201, a plurality of user information and a plurality of transaction behavior data corresponding to the plurality of user information are obtained, wherein the transaction behavior data corresponding to each user information is generated according to behavior feature data of a plurality of dimensions corresponding to the user information.

In the disclosed embodiments, user information refers to information describing the identity of a user, including, but not limited to, name, age, gender, unique identity, and the like. The user information may be obtained through any existing public information base or customer management system, and it should be noted that the user information is obtained under the condition of user permission. By way of example, the user information may be expressed as "Zhang San, 25 years old, man, gramineae".

Transaction behavior data refers to data generated when a user performs transaction-type operations on various physical or virtual products, where the virtual products may be financial products related to financial management, such as stocks, bonds, funds, and virtual currency. Transaction-type operations include buying, selling, and similar operations performed using cash, bank cards, online payment, and the like. Transaction behavior data includes, but is not limited to, the time of occurrence, transaction amount, and number of transactions of a given transaction-type operation. Transaction behavior data may be obtained, once the user's authorization has been verified, from the business software or systems of trust institutions, bank statement records, or transaction record databases.
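As a non-limiting sketch, user information and multi-dimensional transaction behavior data might be held in records such as the following. All class and field names here are hypothetical, and the three behavior dimensions are merely examples drawn from the items listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserInfo:
    """Identity information for one user (hypothetical fields)."""
    user_id: str
    name: str
    age: int
    gender: str


@dataclass(frozen=True)
class TransactionBehavior:
    """Behavior feature data of several dimensions for one user (hypothetical fields)."""
    user_id: str
    days_since_last_transaction: float
    transaction_count: float
    transaction_amount: float

    def as_vector(self):
        """Return the behavior dimensions as a tuple suitable for distance calculations."""
        return (self.days_since_last_transaction,
                self.transaction_count,
                self.transaction_amount)


user = UserInfo(user_id="u001", name="Zhang San", age=25, gender="male")
behavior = TransactionBehavior(user_id="u001", days_since_last_transaction=12.0,
                               transaction_count=7.0, transaction_amount=35000.0)
print(behavior.as_vector())
```

Linking the two records through a shared identifier is one possible design; the disclosure does not prescribe a storage layout.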

Step S202, classifying the transaction behavior data through a plurality of different preset classification numbers to obtain classification results corresponding to each preset classification number, wherein the preset classification numbers are used for representing the number of categories contained in the classification results.

The preset classification number is the preset number of categories that a classification result is to contain, and "a plurality of different preset classification numbers" means two or more such numbers. The count and values of the preset classification numbers given herein are for illustration only and should not be construed as limiting the embodiments of the disclosure; in a specific implementation, the values may be set with reference to user requirements, the scale of the transaction behavior data to be classified, and the like.

Classification processing refers to grouping together data that share common attributes or features, i.e., distinguishing data by their common attributes or features. Classification processing may be implemented by, but is not limited to, data classification models and data classification algorithms; specifically, it may be implemented by K-Means clustering, the FCM clustering algorithm, the KNN nearest-neighbor algorithm, or a machine learning model.

In a specific implementation, each of the plurality of preset classification numbers is used in turn as the expected number of categories for one pass of classification processing, so that classification is performed multiple times and a classification result is obtained for each preset classification number. For example, assume that the preset classification numbers are two, four, and six and that there are three hundred transaction behavior data to be classified. Three classification passes are then required: the first divides the three hundred transaction behavior data into two classes, the second into four classes, and the third into six classes. The classification result of each pass is the transaction behavior data contained in each class. It should be noted that the scale of the transaction behavior data and the order in which the preset classification numbers are used are for illustration only and should not be construed as limiting the embodiments of the present disclosure.
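The example above, one classification pass per preset classification number, can be sketched as follows. This is a minimal illustration only: `equal_chunks` is a deliberately trivial stand-in classifier introduced here as an assumption (it is not K-Means or any algorithm named in the disclosure), and the three hundred values are synthetic.

```python
def equal_chunks(data, k):
    """Stand-in classifier (assumption): sort the values and cut them into k contiguous groups."""
    order = sorted(range(len(data)), key=lambda i: data[i])
    labels = [0] * len(data)
    for rank, idx in enumerate(order):
        labels[idx] = rank * k // len(data)
    return labels


def classify_for_each_preset(data, preset_numbers, classifier):
    """Run one classification pass per preset classification number."""
    return {k: classifier(data, k) for k in preset_numbers}


data = [float(v) for v in range(300)]  # 300 synthetic transaction values
results = classify_for_each_preset(data, [2, 4, 6], equal_chunks)
for k, labels in results.items():
    print(k, len(set(labels)))  # each pass yields exactly k categories
```

Any classifier with the same `(data, k)` signature could be passed in place of the stand-in.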

Step S203, calculating a contour coefficient according to the classification result corresponding to each preset classification number, and selecting a target classification number from a plurality of different preset classification numbers based on the contour coefficient.

The contour coefficient (Silhouette Coefficient) is a parameter for evaluating the quality of a classification result; on the same original data, it can be used to evaluate how different classification algorithms, or different ways of running one algorithm, affect the classification result. Contour coefficients correspond one-to-one with the data to be classified: each datum has one contour coefficient.

The target classification number is a preset classification number, selected from the plurality of preset classification numbers, that satisfies a condition: the contour coefficients of the transaction behavior data under that preset classification number are optimal, or fall within a set range. It is understood that there may be one or more target classification numbers, and that their count does not exceed the total count of preset classification numbers.

Selecting the target classification number from the plurality of different preset classification numbers based on the contour coefficient means using the contour coefficient as the index for evaluating each preset classification number. For example, a contour coefficient range may be preset and the optimal preset classification number selected according to the amount of data whose contour coefficients fall within that range; alternatively, the optimal preset classification number may be selected using parameters such as the mean or median of the contour coefficients of each preset classification number's classification result.
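The alternative selection criteria mentioned above (the amount of data falling within a preset contour coefficient range, or a statistic such as the median) might be sketched as follows. The per-datum coefficient values and the range bounds are hypothetical, and the helper names are introduced here only for illustration.

```python
from statistics import median


def count_in_range(scores, low, high):
    """Number of contour coefficients that fall inside [low, high]."""
    return sum(low <= s <= high for s in scores)


def pick_by(scores_by_k, criterion):
    """Return the preset classification number whose scores maximize the criterion."""
    return max(scores_by_k, key=lambda k: criterion(scores_by_k[k]))


# Hypothetical per-datum contour coefficients for two preset classification numbers.
scores_by_k = {2: [0.8, 0.7, 0.1], 3: [0.6, 0.6, 0.6]}

by_range = pick_by(scores_by_k, lambda s: count_in_range(s, 0.5, 1.0))
by_median = pick_by(scores_by_k, median)
print(by_range, by_median)  # 3 2
```

In this made-up data the two criteria disagree, which illustrates why the evaluation parameter needs to be fixed before the preset classification numbers are compared.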

Step S204, calculating a user characteristic value corresponding to each category through transaction behavior data in each category under the target classification number, and determining a user grade corresponding to each category based on the user characteristic value corresponding to each category.

The user characteristic value is personal characteristic data used for representing the correlation of the user and the transaction behavior, and one user characteristic value is obtained by calculating the transaction behavior data of the user, wherein the transaction behavior data used for calculating the user characteristic value can be all-dimensional behavior characteristic data, or partial-dimensional behavior characteristic data, or combination of the behavior characteristic data and user information data.

User grading refers to dividing users into a number of grades according to user characteristic values. Specifically, the grades may be defined by numerical intervals, for example by dividing the user characteristic values of the categories into a plurality of intervals and assigning a user grade to each interval. The grades may also be defined by the ranking of the user characteristic values of the categories, for example by specifying that a category ranked higher by user characteristic value has a higher user grade than a category ranked lower.
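Both grading modes described above can be illustrated with a short sketch. The characteristic values, the cut points, and the convention that a larger characteristic value means a higher grade are assumptions made for this example only.

```python
import bisect


def grade_by_rank(feature_values):
    """Rank categories by characteristic value; grade 1 goes to the largest value."""
    ranked = sorted(feature_values, key=feature_values.get, reverse=True)
    return {category: grade for grade, category in enumerate(ranked, start=1)}


def grade_by_interval(value, boundaries):
    """Return the index of the interval containing `value` (0 is the lowest interval).

    `boundaries` is an ascending list of cut points between adjacent intervals.
    """
    return bisect.bisect_right(boundaries, value)


# Hypothetical characteristic values for three categories.
values = {"category_0": 0.2, "category_1": 0.9, "category_2": 0.5}
rank_grades = grade_by_rank(values)
interval_grades = [grade_by_interval(v, [0.3, 0.7]) for v in values.values()]
print(rank_grades)      # {'category_1': 1, 'category_2': 2, 'category_0': 3}
print(interval_grades)  # [0, 2, 1]
```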

According to the embodiment of the disclosure, the importance of users can be accurately evaluated and the accuracy of user grading is improved, which helps the financial field and the trust field provide users with accurate services and products that match their positioning.

In some possible implementations, fig. 3 shows a classification processing flow provided by an embodiment of the present disclosure. Referring to fig. 3, the classification processing may be implemented with the K-means algorithm, and step S202 above, in which the plurality of transaction behavior data are classified using a plurality of different preset classification numbers to obtain a classification result corresponding to each preset classification number, may include:

selecting a preset classification number from a plurality of different preset classification numbers in a traversing manner, and executing the following operations according to the selected preset classification number:

step S301, randomly selecting K transaction behavior data from a plurality of transaction behavior data as a clustering center of a K-means algorithm, wherein K is a preset classification number selected at the time.

Random selection refers to selecting K transaction behavior data from the plurality of transaction behavior data; ways of implementing random selection include, but are not limited to, simple random sampling, stratified sampling, and cluster sampling.

Step S302, for the plurality of transaction behavior data, calculating the distance between each transaction behavior data and each clustering center, and assigning each transaction behavior data to the clustering center corresponding to the minimum distance, wherein the distance comprises any one of Euclidean distance, Manhattan distance, cosine similarity, and local density.

Step S303, after each transaction behavior data is distributed to the cluster center corresponding to the minimum distance, calculating the average value of the transaction behavior data included in each cluster center.

Step S304, judging whether the average value of the transaction behavior data included in each cluster center is matched with the corresponding cluster center.

The judging condition of whether the average value of the transaction behavior data is matched with the corresponding clustering center includes, but is not limited to, whether the average value of the transaction behavior data is identical to the clustering center or whether the difference value of the average value of the transaction behavior data and the clustering center exceeds a preset threshold value. For example, if the mean value of the transaction behavior data is the same as the clustering center, the mean value of the transaction behavior data is matched with the corresponding clustering center, or if the difference value of the mean value of the transaction behavior data and the clustering center does not exceed a preset threshold value, the mean value of the transaction behavior data is matched with the corresponding clustering center.

Step S305, in response to the fact that the average value of the transaction behavior data included in any cluster center is not matched with the corresponding cluster center, updating the corresponding cluster center by using the average value of the transaction behavior data, repeatedly executing step S302, respectively calculating the distance between each transaction behavior data and each cluster center for a plurality of transaction behavior data, and distributing each transaction behavior data to the cluster center corresponding to the minimum distance until the average value of the transaction behavior data included in each cluster center is matched with the corresponding cluster center.
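Steps S301 to S305 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the implementation of the disclosure: the function names, the tolerance `tol` used as the matching threshold of step S304, the iteration cap `max_iter`, the fixed random seed, and the handling of an empty cluster are all introduced here for the example.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Manhattan distance, one of the alternatives named in step S302."""
    return sum(abs(a - b) for a, b in zip(p, q))


def k_means(data, k, distance=euclidean, tol=1e-9, max_iter=100, seed=0):
    """Cluster `data` (a list of tuples) into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(data, k)  # S301: k randomly chosen data points
    labels = [0] * len(data)
    for _ in range(max_iter):      # assumption: cap the number of repetitions
        # S302: assign each point to the center at minimum distance
        labels = [min(range(k), key=lambda c: distance(p, centers[c])) for p in data]
        # S303: mean of the points assigned to each center
        means = []
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:
                means.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                means.append(centers[c])  # assumption: an empty cluster keeps its center
        # S304: a mean "matches" its center when they differ by at most tol
        if all(distance(m, c) <= tol for m, c in zip(means, centers)):
            break
        centers = means  # S305: update the centers and repeat from S302
    return labels, centers


# Hypothetical two-dimensional behavior vectors forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = k_means(points, k=2)
print(labels)
```

The `distance` parameter allows another measure, such as `manhattan`, to be substituted in step S302; the default pairing of Euclidean distance with mean-based center updates is the conventional K-means choice.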

According to the embodiment of the disclosure, clustering the transaction behavior data with the K-means algorithm allows the transaction behavior data of a plurality of users to be clustered quickly and accurately according to the set value of K. The algorithm has low complexity and converges quickly, is suitable for most transaction behavior data, and has good universality.

In some possible implementations, the step S203, which calculates the contour coefficient according to the classification result corresponding to each preset classification number, and selects the target classification number from the plurality of different preset classification numbers based on the contour coefficient, may include:

calculating, according to formula one, the contour coefficient of each transaction behavior data under each preset classification number.

$$SC(i) = \frac{b(i) - d(i)}{\max\big(b(i),\, d(i)\big)} \qquad \text{(formula one)}$$

wherein SC(i) represents the contour coefficient corresponding to the i-th transaction behavior data in any cluster center, b(i) represents the average distance between the i-th transaction behavior data in that cluster center and the rest of the transaction behavior data in that cluster center, and d(i) represents the average distance between the i-th transaction behavior data in that cluster center and each transaction behavior data in the adjacent cluster center of that cluster center.

It should be noted that the adjacent cluster center of any cluster center refers to the cluster center closest to that cluster center, where closeness is determined for each transaction behavior data X_i as follows: with p denoting any transaction behavior data in a given cluster center, the average distance from X_i to all transaction behavior data p in that cluster center is taken as the distance from X_i to that cluster center, and the cluster center with the smallest such distance to X_i is selected as the nearest cluster center.
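A per-datum contour coefficient can be computed along the following lines. The sketch follows the standard silhouette definition, in which the average distance to the nearest other cluster is the minuend and the average distance within the datum's own cluster is the subtrahend, so that well-separated data score close to 1; how this maps onto the symbols b(i) and d(i) of formula one is an assumption of the example. Scoring a single-member cluster as 0 is likewise a common convention adopted here, not something stated in the disclosure.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_values(data, labels):
    """Return one contour (silhouette) coefficient per datum."""
    clusters = {}
    for p, lab in zip(data, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(data, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)  # convention for singletons or a single cluster
            continue
        # Average distance to the other members of the same cluster
        # (the distance from p to itself is 0, hence the divisor len(own) - 1).
        within = sum(euclidean(p, q) for q in own) / (len(own) - 1)
        # Average distance to each other cluster; keep the nearest one.
        nearest = min(
            sum(euclidean(p, q) for q in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(within, nearest)
        scores.append((nearest - within) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical one-dimensional data with a good and a poor labelling.
data = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_values(data, [0, 0, 1, 1])
poor = silhouette_values(data, [0, 1, 0, 1])
print(good)
print(poor)
```

For the good labelling every coefficient is close to 1, while the poor labelling produces negative coefficients, matching the use of a larger coefficient as evidence of a better classification.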

And respectively calculating the average value of all the contour coefficients under each preset classification number.

The contour coefficient is corresponding to each transaction behavior data, the average value of the contour coefficient under each preset classification number is obtained by respectively obtaining the contour coefficient sum of each transaction behavior data under each preset classification number, then comparing the contour coefficient sum corresponding to each preset classification number with the transaction behavior data to obtain the average value of the contour coefficient, and the average value of the contour coefficient corresponding to each preset classification number one by one can be obtained through calculation.

And taking the preset classification number corresponding to the maximum average value of the contour coefficient as the target classification number.

For example, fig. 4 shows a correspondence curve between the preset classification numbers and the contour coefficients provided by an embodiment of the present disclosure. In fig. 4 there are eight preset classification numbers, which take the integer values from two to nine. It can be seen that the contour coefficient is largest (approximately 0.66) when the preset classification number equals two, so two can be used as the target classification number. It should be noted that the preset classification numbers shown in fig. 4 and the contour coefficient values corresponding to them are for illustration only and should not be construed as limiting the embodiments of the present disclosure.

According to the embodiment of the disclosure, the clustering effect of the k-means algorithm is evaluated by using the contour coefficient, and the optimal preset classification number is selected according to the largest average contour coefficient, so that the classification number does not depend on the number of evaluation indexes and the obtained target classification number is more reasonable. This provides an accurate classification reference for actually managing users and has good practicability.
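The selection procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation of the disclosure: the function names `silhouette_values` and `select_target_k`, the sample points and the label lists are hypothetical, Euclidean distance is assumed, and a point that is alone in its cluster is assigned a coefficient of 0. The sketch uses the orientation that is usual for the silhouette (contour) coefficient, namely the average distance to the adjacent cluster minus the average distance within the point's own cluster, so that well-separated clusters score close to 1.

```python
import math


def silhouette_values(points, labels):
    """Per-sample contour (silhouette) coefficients for one clustering result."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own or len(clusters) < 2:
            values.append(0.0)  # assumed convention for singleton clusters
            continue
        # Average distance to the other members of the same cluster.
        intra = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # Average distance to the adjacent (closest) other cluster.
        nearest = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(intra, nearest)
        values.append((nearest - intra) / denom if denom else 0.0)
    return values


def select_target_k(points, labels_by_k):
    """Return the k with the largest mean coefficient, plus all the means."""
    means = {}
    for k, labels in labels_by_k.items():
        coeffs = silhouette_values(points, labels)
        means[k] = sum(coeffs) / len(coeffs)
    best_k = max(means, key=means.get)
    return best_k, means


# Hypothetical data: two tight groups, clustered with k = 2 and k = 3.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels_by_k = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}
best_k, means = select_target_k(points, labels_by_k)
```

In this toy data the split into two groups yields the larger mean coefficient, so `best_k` is 2, mirroring the way the target classification number is read off the curve in fig. 4.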

In some possible implementations, fig. 5 shows a process flow of transaction behavior data missing values provided by an embodiment of the present disclosure, and referring to fig. 5, the embodiment of the present disclosure further provides a processing manner of the transaction behavior data missing values, where the missing value processing operation precedes the classification processing operation, and specifically includes:

Step S401, selecting a dimension from a plurality of dimensions of transaction behavior data in a traversal manner as a first target dimension.

Step S402, judging whether behavior feature data corresponding to a first target dimension exists in each transaction behavior data.

Whether the behavior feature data exists refers to whether the position corresponding to the behavior feature data holds data that meets the requirements. Generally, when the data does not exist, the corresponding position either holds no data or is filled with a specific character (such as NaN). The case where the position holds no data can be detected by judging whether the data is numerical, and the case where a specific character is used for filling can be detected by judging whether the data is equal to that specific character.

And step S403, assigning the behavior feature data corresponding to the first target dimension by adopting a first preset assignment rule in response to the absence.

The first preset assignment rule includes, but is not limited to, replacing the missing value with the mean value of the behavior feature data of all users in that dimension; of course, the missing value may also be replaced with the median value of the behavior feature data of all users in that dimension, or with a custom numerical value. If there is no missing value, the data does not need to be processed.

According to the embodiment of the disclosure, the missing value search is performed on the feature data of each user in each dimension in advance before the classification processing is performed on the transaction behavior data, and the searched missing values are filled, so that the data are kept complete when the transaction behavior data are classified, and the accuracy of classification of the transaction behavior data can be further improved.
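A minimal sketch of steps S401 to S403 is given below for illustration. It assumes that each transaction behavior data is a list of values, that a missing entry is either `None`, a non-numeric value, a floating-point NaN, or a filler string (the marker `"NaN"` is an assumption), and that the mean of the present values is used as the first preset assignment rule. The names `is_missing` and `fill_missing` are hypothetical.

```python
import math
from statistics import mean


def is_missing(value, fill_marker="NaN"):
    """True when the slot holds no usable number."""
    if value is None or value == fill_marker:
        return True
    if not isinstance(value, (int, float)):
        return True  # non-numeric content counts as missing
    return isinstance(value, float) and math.isnan(value)


def fill_missing(rows, fill_marker="NaN"):
    """Traverse every dimension and fill missing entries with that dimension's mean."""
    filled = [list(row) for row in rows]  # leave the input untouched
    for dim in range(len(rows[0])):
        present = [r[dim] for r in rows if not is_missing(r[dim], fill_marker)]
        if len(present) == len(rows):
            continue  # nothing missing in this dimension
        if not present:
            raise ValueError(f"dimension {dim} has no usable values")
        replacement = mean(present)
        for row in filled:
            if is_missing(row[dim], fill_marker):
                row[dim] = replacement
    return filled


# Hypothetical data: three users, two dimensions, one gap in each dimension.
rows = [[1.0, 10], [None, 20], [3.0, "NaN"]]
filled = fill_missing(rows)
```

Swapping `mean` for `statistics.median`, or for a fixed custom value, gives the other assignment rules mentioned above.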

In some possible implementations, fig. 6 shows a process flow for outliers in the transaction behavior data provided by an embodiment of the disclosure. Referring to fig. 6, the embodiment of the disclosure further provides a processing manner for outliers in the transaction behavior data, where the outlier processing operation precedes the classification processing operation, and specifically includes:

Step S501, selecting a dimension from a plurality of dimensions of transaction behavior data in a traversal manner as a second target dimension.

Step S502, calculating standard deviation of behavior feature data corresponding to the second target dimension in all transaction behavior data.

Step S503, judging whether the ratio of the behavior characteristic data corresponding to the second target dimension in each transaction behavior data to the standard deviation exceeds a preset multiple.

The preset multiple may be any number, and in some preferred embodiments the preset multiple may be set to three times.

And step S504, in response to exceeding the preset multiple, adopting a second preset assignment rule to assign values to the behavior feature data corresponding to the second target dimension.

Wherein the second preset assignment rule includes, but is not limited to, replacing with a mean value of the behavior feature data of all users in one dimension, replacing with a median value of the behavior feature data of all users in one dimension; in some preferred embodiments, the second preset assignment rule is to replace data exceeding three standard deviations with three standard deviations.

According to the embodiment of the disclosure, the abnormal data is searched for the characteristic data of each user in each dimension in advance before the transaction behavior data is classified, and the abnormal data is repaired by using the specific rule when the abnormal data is found, so that the data is accurate and reliable when the transaction behavior data is classified, and the accuracy of classification of the transaction behavior data can be further improved.
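Steps S501 to S504 can be sketched as follows. The sketch follows the wording of the steps literally (the ratio of a value to the standard deviation of its dimension is compared with the preset multiple) and applies the preferred rule of replacing an out-of-range value with the preset multiple of the standard deviation. The function name `cap_outliers`, the use of the population standard deviation and the sample data are assumptions made for illustration.

```python
from statistics import pstdev


def cap_outliers(rows, multiple=3.0):
    """Cap values whose ratio to the dimension's standard deviation exceeds `multiple`."""
    capped = [list(row) for row in rows]  # leave the input untouched
    for dim in range(len(rows[0])):
        std = pstdev([row[dim] for row in rows])  # population std (assumption)
        if std == 0:
            continue  # constant dimension, nothing can be an outlier
        limit = multiple * std
        for row in capped:
            if row[dim] / std > multiple:
                row[dim] = limit
    return capped


# Hypothetical data: nine ordinary values and one extreme value in a single dimension.
rows = [[1.0] for _ in range(9)] + [[50.0]]
capped = cap_outliers(rows, multiple=3.0)
```

Replacing `limit` with the mean or the median of the dimension gives the other assignment rules listed above.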

In some possible implementations, the transaction behavior data is not a single index but is usually formed by a plurality of behavior feature data, and different behavior feature data have different attributes, orders of magnitude and units, so that subsequent operations such as comparison, weighting and summation cannot be performed on them directly. To eliminate the differences between different behavior features and to unify the standard of comparison, data standardization is important. Fig. 7 shows a transaction behavior data standardization processing flow provided by an embodiment of the present disclosure. Referring to fig. 7, the embodiment of the present disclosure further provides a transaction behavior data standardization processing manner, where the standardization processing operation precedes the classification processing operation, and specifically includes:

Step S601, selecting a dimension from a plurality of dimensions of transaction behavior data in a traversal manner as a third target dimension.

Step S602, calculating the mean value and standard deviation of the behavior feature data corresponding to the third target dimension in all the transaction behavior data.

Step S603, calculating the standard value of the behavior feature data corresponding to the third target dimension in each transaction behavior data according to Formula II.

$$x^{*} = \frac{x - \bar{x}}{\operatorname{std}(x)} \qquad \text{(Formula II)}$$

wherein $x^{*}$ represents the standard value, $x$ represents the original value of the behavior feature data corresponding to the third target dimension in any one transaction behavior data, $\bar{x}$ represents the mean value of the behavior feature data corresponding to the third target dimension in all the transaction behavior data, and $\operatorname{std}(x)$ represents the standard deviation of the behavior feature data corresponding to the third target dimension in all the transaction behavior data.

To make the normalization process easier to understand, the normalization of the feature data of five users in three dimensions is described in detail below. Table 1 shows the raw data before normalization. As can be seen from Table 1, the age dimension, the gender dimension and the education dimension differ completely in their units and in the way their values are expressed. Even if the data in the gender dimension is expressed numerically (for example, 0 for male and 1 for female), its values are far smaller than those in the age dimension, and the same problem exists between the age dimension and the education dimension.

Table 1 raw data before normalization

User	Age	Gender	Education
User 1	25	Male	Bachelor's degree
User 2	40	Female	Empty
User 3	45	Male	Postgraduate
User 4	36	Male	Bachelor's degree
User 5	29	Female	Postgraduate

According to the embodiment of the disclosure, the data in the gender dimension and the data in the education dimension are each converted into numerical values by set rules. After conversion, both vary between 0 and 2, so their value ranges do not differ greatly from each other. Therefore only the data in the age dimension is converted by Formula II, which gives the normalized data shown in Table 2.

Table 2 normalized data

User	Age	Gender	Education
User 1	-1.381	0	1
User 2	0.690	1	0
User 3	1.381	0	2
User 4	0.138	0	1
User 5	-0.828	1	2

Referring to Table 2, it can be seen that after normalization the data in the age dimension falls in a value range close to that of the data in the gender dimension and the data in the education dimension; that is, the data in each dimension is comparable after data normalization. It should be noted that the specific dimensions involved in the normalization process in the embodiments of the present disclosure, and the specific numerical values of the data in each dimension, are for illustration only and should not be construed as limiting the embodiments of the present disclosure.

According to the embodiment of the disclosure, the characteristic data of each user in each dimension is subjected to data standardization processing in advance before the transaction behavior data is subjected to classification processing, so that the differences of the behavior characteristic data in different dimensions in attributes, orders of magnitude and units can be eliminated, various subsequent operations such as comparison, weighting and summation can be performed on the behavior characteristic data in different dimensions, and the accuracy of classification of the transaction behavior data can be further improved.
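As a worked check of the standardization step, the short sketch below applies the usual z-score form (original value minus the mean, divided by the standard deviation) to the five ages of Table 1. It is an illustration under assumptions: the function name `standardize` is hypothetical, and the population standard deviation is used because it appears to agree with the age column of Table 2 to within about 0.001.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score each value: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant dimension: nothing to scale
    return [(v - mu) / sigma for v in values]


# Ages of the five users in Table 1.
ages = [25, 40, 45, 36, 29]
z_ages = standardize(ages)

# Age column of Table 2, used here only for comparison.
table_2_ages = [-1.381, 0.690, 1.381, 0.138, -0.828]
max_gap = max(abs(z - t) for z, t in zip(z_ages, table_2_ages))
```

Here the mean age is 35 and the population standard deviation is roughly 7.24, so the first user's standard value is (25 - 35) / 7.24, or about -1.381.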

In some possible implementations, the step S204, calculating the user feature value corresponding to each category through the transaction behavior data in each category under the target classification number, and determining the user classification corresponding to each category based on the user feature value corresponding to each category, may include:

respectively calculating the mean value of the behavior characteristic data of each dimension in each category;

the average value of the behavior characteristic data of each dimension is weighted according to the preset weight corresponding to each dimension, and a user characteristic value corresponding to each category is obtained;

the weight corresponding to each dimension may be determined in any existing manner, for example, by collecting expert questionnaires or by calculating the weight with the analytic hierarchy process, which is not limited by the embodiment of the present disclosure;

Sorting a plurality of user characteristic values corresponding to a plurality of categories under the target classification number according to a preset order;

the preset sequence can be ascending, namely, the preset sequence is ordered from small to large according to the user characteristic values, and the preset sequence can also be descending, namely, the preset sequence is ordered from large to small according to the user characteristic values;

determining the user classification corresponding to each category according to the ordering result of the user characteristic values corresponding to each category;

the user classification may be represented by a specific numerical value, for example, a numerical value from 0 to N (N is a positive integer greater than or equal to 1), and of course, the user classification may also be represented by a specific character, for example, an ordinary client, an important development client, an ordinary development client, an important client, and the like.

According to the embodiment of the disclosure, when the user characteristic value of each category under the target classification is calculated, the behavior characteristic data of each user in each dimension under each category is comprehensively considered, so that each category can be comprehensively evaluated, the user classification quantification is realized by the user characteristic value corresponding to each category, and the user classification accuracy is remarkably improved.
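The grading steps above (per-dimension mean, weighted combination, sorting, grade assignment) can be sketched as follows. The function name `grade_categories`, the weights, the sample rows and the two grade names are hypothetical values chosen for illustration; an ascending order is assumed, so the category with the smallest user feature value receives the first grade name.

```python
def grade_categories(rows, labels, weights, grade_names):
    """Compute one user feature value per category and map categories to grades."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)

    feature_values = {}
    for label, members in groups.items():
        # Mean of each dimension within the category.
        dim_means = [sum(col) / len(members) for col in zip(*members)]
        # Weighted combination of the per-dimension means.
        feature_values[label] = sum(w * m for w, m in zip(weights, dim_means))

    # Ascending order: smallest feature value gets the first grade name.
    ranked = sorted(feature_values, key=feature_values.get)
    grades = {label: grade_names[pos] for pos, label in enumerate(ranked)}
    return feature_values, grades


# Hypothetical data: three users in two categories, two dimensions.
rows = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
labels = [0, 0, 1]
weights = [0.5, 0.5]
grade_names = ["ordinary client", "important client"]
feature_values, grades = grade_categories(rows, labels, weights, grade_names)
```

Category 0 has dimension means 2.0 and 3.0 and therefore a feature value of 2.5, while category 1 has a feature value of 15.0, so category 1 receives the higher grade.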

In some possible implementations, in order to improve the accuracy and practicality of user classification in the trust industry, the embodiments of the present disclosure improve the existing RFM model and, in combination with the characteristics of trust industry users, propose constructing a user RFMV model from the following four indexes: the time of the most recent consumption (Recency, R), the consumption frequency (Frequency, F), the consumption amount (Monetary, M), and the user's assets (Value, V). FIG. 8 shows a schematic view of user classification in a CRM system in the trust industry. With reference to FIG. 8, a user classification method for the trust industry is described in detail below, and the specific implementation procedure is as follows:

Firstly, acquiring original data: the original user data and product data are collected from the CRM system, where one part of the original data can be acquired directly, and the other part, namely user behavior data, needs to be acquired by docking the CRM system with the app system of the trust company;

then, preprocessing the acquired original data, wherein the preprocessing comprises sequentially filling blank values, correcting abnormal values and standardizing the data;

then, classifying the data: the preprocessed data is fed into the K-means algorithm to obtain classification results of users under different K values;

after clustering, analyzing the classification results: the contour coefficient, a statistical index of each classification result, is calculated, and a relation diagram between the contour coefficient and the clustering K value is drawn;

selecting the optimal contour coefficient from the relation diagram between the contour coefficient and the clustering K value, and calculating the characteristic value of each category of users under the corresponding classification result;

finally, according to the different user characteristic values, an expert group defines a classification label for each category; for example, the user classification labels determined from low to high according to the user characteristic values may be ordinary clients, important development clients, ordinary development clients, important clients and the like.

According to the embodiment of the disclosure, the user grading method solves the problem that the CRM system in the trust industry cannot effectively grade customer groups, and can grade users in the trust industry accurately and efficiently; it makes the most of the trust company's labor cost, since the energy and time to be invested in users of different levels can be adjusted accordingly, so that the ratio of output to time invested is maximized; and the obtained user grading can assist the trust industry in constructing user portraits, providing differentiated services for users and improving user experience.
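The clustering step of this procedure, running K-means once for each candidate K, can be sketched as below. This is a simplified illustration rather than the implementation of the disclosure: the names `k_means` and `cluster_for_each_k` are hypothetical, Euclidean distance is assumed, the initial cluster centers are drawn at random from the data with a fixed seed, and an iteration cap is added as a safeguard.

```python
import math
import random


def k_means(points, k, seed=0, max_iter=100):
    """Plain K-means: random initial centers, assign, update, repeat until stable."""
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Assign every point to the cluster center at minimum distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # the means match the centers: converged
        centers = new_centers
    return labels, centers


def cluster_for_each_k(points, k_values, seed=0):
    """Classification result (one label list) for every preset classification number."""
    return {k: k_means(points, k, seed)[0] for k in k_values}


# Hypothetical preprocessed data: two well-separated groups.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
labels, centers = k_means(points, k=2)
results = cluster_for_each_k(points, k_values=(2, 3))
```

The dictionary returned by `cluster_for_each_k` has one entry per K, which is the form needed for comparing contour coefficients across K values.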

It will be appreciated that the above method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from their principles and logic; for reasons of space, these are not described in detail in the present disclosure. It will be appreciated by those skilled in the art that, in the above-described methods of the embodiments, the particular order of execution of the steps should be determined by their function and possible inherent logic.

In addition, the disclosure further provides a user grading device, an electronic device, and a computer readable storage medium, each of which may be used to implement any one of the user grading methods provided in the disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding descriptions in the method part, which are not repeated here.

Fig. 9 is a block diagram of a user grading apparatus provided in an embodiment of the present disclosure.

Referring to fig. 9, an embodiment of the present disclosure provides a user grading apparatus, including:

an obtaining module 801, configured to obtain a plurality of user information and a plurality of transaction behavior data corresponding to the plurality of user information, where the transaction behavior data corresponding to each user information is generated according to behavior feature data of a plurality of dimensions corresponding to the user information;

the classification module 802 is configured to perform classification processing on the transaction behavior data through a plurality of different preset classification numbers, to obtain a classification result corresponding to each preset classification number, where the preset classification number is used to characterize the number of categories included in the classification result;

a selecting module 803, configured to calculate a contour coefficient according to a classification result corresponding to each preset classification number, and select a target classification number from a plurality of different preset classification numbers based on the contour coefficient;

a determining module 804, configured to calculate a user feature value corresponding to each category through the transaction behavior data in each category under the target classification number, and determine a user ranking corresponding to each category based on the user feature value corresponding to each category.

The modules in the user grading apparatus described above may be implemented in whole or in part in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.

Referring to fig. 10, an embodiment of the present disclosure provides an electronic device including: at least one processor 901; at least one memory 902, and one or more I/O interfaces 903, connected between the processor 901 and the memory 902; wherein the memory 902 stores one or more computer programs executable by the at least one processor 901, the one or more computer programs being executable by the at least one processor 901 to enable the at least one processor 901 to perform the user grading method described above.

The modules in the electronic device described above may be implemented in whole or in part in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.

The disclosed embodiments also provide a computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor/processing core, implements the user grading method described above. The computer readable storage medium may be a volatile or nonvolatile computer readable storage medium.

Embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when executed in a processor of an electronic device, performs the user grading method described above.

Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).

The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable program instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, Random Access Memory (RAM), Read Only Memory (ROM), Erasable Programmable Read Only Memory (EPROM), Static Random Access Memory (SRAM), flash memory or other memory technology, portable Compact Disc Read Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.

The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.

Computer program instructions for performing the operations of the present disclosure can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of computer readable program instructions, which can execute the computer readable program instructions.

The computer program product described herein may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.

Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims

1. A user grading method, comprising:

2. The user grading method according to claim 1, wherein the classifying the plurality of transaction behavior data by a plurality of different preset classification numbers, respectively, to obtain classification results corresponding to each preset classification number, comprises:

randomly selecting K transaction behavior data from the transaction behavior data as a clustering center of a K-means algorithm, wherein K is a preset classification number selected at the time;

for the transaction behavior data, calculating the distance between each transaction behavior data and each clustering center, and distributing each transaction behavior data to the clustering center corresponding to the minimum distance, wherein the distance comprises any one of Euclidean distance, Manhattan distance, cosine similarity and local density;

after each transaction behavior data is distributed to the cluster center corresponding to the minimum distance, calculating the average value of the transaction behavior data included in each cluster center;

judging whether the average value of transaction behavior data included in each clustering center is matched with the corresponding clustering center;

and in response to the fact that the average value of the transaction behavior data included in any clustering center is not matched with the corresponding clustering center, updating the corresponding clustering center by using the average value of the transaction behavior data, repeatedly executing the steps of respectively calculating the distance between each transaction behavior data and each clustering center aiming at the transaction behavior data, and distributing each transaction behavior data to the clustering center corresponding to the minimum distance until the average value of the transaction behavior data included in each clustering center is matched with the corresponding clustering center.

3. The user grading method of claim 2, wherein calculating a contour coefficient according to the classification result corresponding to each preset classification number, and selecting a target classification number from the plurality of different preset classification numbers based on the contour coefficient, comprises:

calculating the contour coefficient of each piece of transaction behavior data under each preset classification number according to the following Formula 1;

$$SC(i) = \frac{b(i) - d(i)}{\max\bigl(b(i),\, d(i)\bigr)} \qquad \text{(Formula 1)}$$

wherein SC(i) represents the contour coefficient of the i-th piece of transaction behavior data assigned to any cluster center, b(i) represents the average distance between that piece of transaction behavior data and the remaining transaction behavior data assigned to the same cluster center, and d(i) represents the average distance between that piece of transaction behavior data and each piece of transaction behavior data assigned to the adjacent cluster center;

calculating, for each preset classification number, the mean of all contour coefficients under that preset classification number;
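
The averaging step above can be sketched as follows. The contour coefficient is commonly known as the silhouette coefficient, and this sketch uses its conventional form, (nearest - intra) / max(nearest, intra), so that values near 1 indicate well-separated clusters; how these two quantities map onto the symbols b(i) and d(i) of Formula 1 is an assumption of the sketch. The claim text shown here does not state the selection rule, so picking the K with the highest mean in `select_target_k` is also an assumption. All function names are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_contour_coefficient(points, labels):
    """Average contour (silhouette) coefficient; needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # Convention (assumption): a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # Average distance to the other records in the same cluster
        # (the record's distance to itself contributes 0 to the sum).
        intra = sum(euclidean(p, q) for q in own) / (len(own) - 1)
        # Average distance to the records of the nearest other cluster.
        nearest = min(
            sum(euclidean(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(nearest, intra)
        scores.append(0.0 if denom == 0 else (nearest - intra) / denom)
    return sum(scores) / len(scores)


def select_target_k(points, labelings):
    """Return the K whose labeling has the highest mean coefficient.

    `labelings` maps each preset classification number to its label list.
    The "highest mean wins" rule is an assumption of this sketch.
    """
    return max(
        labelings,
        key=lambda k: mean_contour_coefficient(points, labelings[k]),
    )
```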

4. The user grading method according to claim 1, wherein before the step of classifying the plurality of transaction behavior data by a plurality of different preset classification numbers, the method further comprises:

traversing the plurality of dimensions of the transaction behavior data and selecting each dimension in turn as a first target dimension;

determining whether behavior feature data corresponding to the first target dimension exists in each piece of transaction behavior data;

and in response to the behavior feature data not existing, assigning a value to the behavior feature data corresponding to the first target dimension by using a first preset assignment rule.
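
The missing-value handling of claim 4 can be sketched as follows. The claim does not specify the first preset assignment rule, so the rule is passed in as a callable; `fill_missing`, `mean_rule`, and the dictionary record layout are hypothetical choices made for illustration.

```python
def fill_missing(records, dimensions, rule):
    """Return copies of `records` with every missing dimension filled by `rule`.

    `rule(dimension, present_values)` stands in for the "first preset
    assignment rule", which the claim does not specify.
    """
    filled = [dict(r) for r in records]
    for dim in dimensions:  # traverse the dimensions one at a time
        present = [r[dim] for r in records if r.get(dim) is not None]
        for r in filled:
            if r.get(dim) is None:  # no behavior feature data for this dimension
                r[dim] = rule(dim, present)
    return filled


def mean_rule(dimension, present_values):
    """Hypothetical rule: mean of the observed values, or 0.0 if there are none."""
    return sum(present_values) / len(present_values) if present_values else 0.0
```

For example, `fill_missing(records, ["frequency", "amount"], mean_rule)` leaves the input records untouched and returns filled copies.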

5. The user grading method according to claim 1, wherein before the step of classifying the plurality of transaction behavior data by a plurality of different preset classification numbers, the method further comprises:

traversing the plurality of dimensions of the transaction behavior data and selecting each dimension in turn as a second target dimension;

calculating the standard deviation of the behavior feature data corresponding to the second target dimension in all the transaction behavior data;

determining whether the ratio of the behavior feature data corresponding to the second target dimension in each piece of transaction behavior data to the standard deviation exceeds a preset multiple;

and in response to the ratio exceeding the preset multiple, assigning a value to the behavior feature data corresponding to the second target dimension by using a second preset assignment rule.
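
The outlier check of claim 5 can be sketched for a single dimension as follows. The sketch follows the claim wording literally, comparing the ratio of the value itself to the standard deviation against the preset multiple. The population standard deviation, the default multiple of 3, and the capping rule used in place of the unspecified second preset assignment rule are all assumptions, and `cap_outliers` is a hypothetical name.

```python
import statistics


def cap_outliers(values, multiple=3.0):
    """Replace values whose ratio to the standard deviation exceeds `multiple`.

    Assumptions: population standard deviation, and a capping rule
    (value -> multiple * std) standing in for the unspecified
    "second preset assignment rule".
    """
    std = statistics.pstdev(values)
    if std == 0:
        return list(values)  # all values are equal, so nothing is flagged
    return [multiple * std if v / std > multiple else v for v in values]
```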

6. The user grading method according to claim 1, wherein before the step of classifying the plurality of transaction behavior data by a plurality of different preset classification numbers, the method further comprises:

traversing the plurality of dimensions of the transaction behavior data and selecting each dimension in turn as a third target dimension;

calculating the mean value and standard deviation of the behavior feature data corresponding to the third target dimension in all the transaction behavior data;

calculating a standard value of the behavior feature data corresponding to the third target dimension in each piece of transaction behavior data according to the following Formula 2;

$$x^{*} = \frac{x - \bar{x}}{\operatorname{std}(x)} \qquad \text{(Formula 2)}$$

wherein $x^{*}$ represents the standard value, $x$ represents the original value of the behavior feature data corresponding to the third target dimension in any piece of transaction behavior data, $\bar{x}$ represents the mean of the behavior feature data corresponding to the third target dimension in all the transaction behavior data, and $\operatorname{std}(x)$ represents the standard deviation of the behavior feature data corresponding to the third target dimension in all the transaction behavior data.
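
The standardization of claim 6 can be sketched for a single dimension as follows, assuming Formula 2 is the usual z-score, (x - mean) / std(x), which is what the symbol definitions above suggest. The population standard deviation and the handling of a constant dimension are assumptions, and `standardize` is a hypothetical name.

```python
import statistics


def standardize(values):
    """Return (x - mean) / std for each x, using the population std (assumed)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # Assumption: a constant dimension maps to 0 for every record.
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]
```

Standardizing each dimension separately keeps a large-valued dimension such as consumption amount from dominating the distance computation during clustering.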

7. The user grading method of claim 1, wherein calculating the user feature value corresponding to each category from the transaction behavior data in each category under the target classification number, and determining the user classification corresponding to each category based on the user feature value corresponding to each category, comprises:

and determining the user classification corresponding to each category according to the ranking of the user feature values corresponding to the categories.
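
The ordering step of claim 7 can be sketched as follows. The text shown here does not say how the user feature value of a category is computed, so the sketch assumes it is the mean, over the records in the category, of the sum of each record's standardized dimensions; that formula, the convention that grade 1 is the highest, and the name `grade_categories` are all hypothetical.

```python
def grade_categories(points, labels):
    """Map each category label to a grade, where 1 is the highest feature value.

    Assumption: the user feature value of a category is the mean, over its
    records, of the sum of each record's (already standardized) dimensions.
    """
    totals, counts = {}, {}
    for p, lab in zip(points, labels):
        totals[lab] = totals.get(lab, 0.0) + sum(p)
        counts[lab] = counts.get(lab, 0) + 1
    feature = {lab: totals[lab] / counts[lab] for lab in totals}
    # Sort the categories by feature value, highest first, then number them.
    ordered = sorted(feature, key=feature.get, reverse=True)
    return {lab: grade for grade, lab in enumerate(ordered, start=1)}
```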

8. The user grading method of any one of claims 1 to 7, wherein the plurality of dimensions of behavior feature data comprise: the time of the most recent consumption, the consumption frequency, the consumption amount, and the user's assets.

9. A user grading apparatus, comprising:

10. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the user grading method according to any of claims 1-8.

11. A computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the user grading method according to any of claims 1-8.