WO2023279696A1 - Method, apparatus, device, and storage medium for identifying business risk customer groups - Google Patents

Method, apparatus, device, and storage medium for identifying business risk customer groups

Info

Publication number
WO2023279696A1
WO2023279696A1 PCT/CN2022/071685 CN2022071685W WO2023279696A1 WO 2023279696 A1 WO2023279696 A1 WO 2023279696A1 CN 2022071685 W CN2022071685 W CN 2022071685W WO 2023279696 A1 WO2023279696 A1 WO 2023279696A1
Authority
WO
WIPO (PCT)
Prior art keywords
customer
customer group
business risk
business
group
Prior art date
Application number
PCT/CN2022/071685
Inventor
王遥
朱旭音
张霖
赵天骄
贾素苇
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2023279696A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Definitions

  • The present application relates to the field of intelligent decision-making in artificial intelligence, and in particular to a method, apparatus, device, and storage medium for identifying business risk customer groups.
  • Current methods for identifying high-risk customer groups generally build a customer group identification model on the overall customer population, use that model to make business judgments on the whole population, and define as target customers (that is, the high-risk customer group) those whose predicted value exceeds a certain threshold or ranks near the top.
  • The recognition accuracy of such methods is low.
  • The present application provides a method, apparatus, device, and storage medium for identifying business risk customer groups, which are used to improve the accuracy of business risk customer group identification based on a large number of customer groups.
  • the first aspect of this application provides a method for identifying business risk customer groups, including:
  • the initial business risk customer group is screened to obtain a target business risk customer group.
  • The second aspect of the present application provides a device for identifying business risk customer groups, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the following steps are implemented when the processor executes the computer-readable instructions:
  • the initial business risk customer group is screened to obtain a target business risk customer group.
  • the third aspect of the present application provides a computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium, and when the computer instructions are run on the computer, the computer is made to perform the following steps:
  • the initial business risk customer group is screened to obtain a target business risk customer group.
  • the fourth aspect of the application provides an identification device for business risk customer groups, including:
  • a classification module, used to obtain the customer information, business variables, and empirical classification factors of the customer groups to be processed, and to classify the customer information of the customer groups to be processed by the empirical classification factors to obtain experience-classified customer groups;
  • a first prediction module, used to call the decision tree rule customer group segmentation model corresponding to the business variable and perform business risk customer group prediction on the customer information of the customer groups to be processed to obtain decision tree classified customer groups;
  • a merging module, used to merge the experience-classified customer groups and the decision tree classified customer groups to obtain the initial business risk customer group;
  • a second prediction module, used to obtain the business risk customer group information of the initial business risk customer group, call the target prediction model corresponding to the initial business risk customer group, and perform business risk prediction on the business risk customer group information to obtain a business risk prediction value;
  • a screening module, used to screen the initial business risk customer group according to the business risk prediction value to obtain the target business risk customer group.
  • In the technical solution provided by the present application, the customer information, business variables, and empirical classification factors of the customer groups to be processed are obtained, and the customer information of the customer groups to be processed is classified by the empirical classification factors to obtain experience-classified customer groups; the decision tree rule customer group segmentation model corresponding to the business variable is called to perform business risk customer group prediction on the customer information of the customer groups to be processed, obtaining decision tree classified customer groups; the experience-classified customer groups and the decision tree classified customer groups are merged to obtain the initial business risk customer group; the business risk customer group information of the initial business risk customer group is obtained, the target prediction model corresponding to the initial business risk customer group is called, and business risk prediction is performed on the business risk customer group information to obtain a business risk prediction value; and the initial business risk customer group is screened based on the business risk prediction value to obtain the target business risk customer group.
  • Classifying the diverse customer groups to be processed ensures the coverage of the target business risk customer group, and using the target prediction model to find the customers with higher business risk within each group ensures the accuracy with which the target prediction model captures the target business risk customer group. Classifying the customer information of the customer groups to be processed by both the empirical classification factors and the decision tree rule customer group segmentation model improves the accuracy of the initial business risk customer group. Together, these guarantee the accuracy of identifying high-risk customer groups (that is, target business risk customer groups), thereby improving the accuracy of business risk customer group identification based on a large number of customer groups.
  • Fig. 1 is a schematic diagram of an embodiment of a method for identifying business risk customer groups in the embodiment of the present application
  • FIG. 2 is a schematic diagram of another embodiment of the method for identifying business risk customer groups in the embodiment of the present application
  • FIG. 3 is a schematic diagram of an embodiment of an identification device for business risk customer groups in the embodiment of the present application
  • FIG. 4 is a schematic diagram of another embodiment of an identification device for business risk customer groups in the embodiment of the present application.
  • Fig. 5 is a schematic diagram of an embodiment of an identification device for a business risk customer group in the embodiment of the present application.
  • Embodiments of the present application provide a method, apparatus, device, and storage medium for identifying business risk customer groups, which improve the accuracy of identifying business risk customer groups based on a large number of customer groups.
  • an embodiment of the method for identifying business risk customer groups in the embodiment of the application includes:
  • the subject of execution of this application may be an identification device of a business risk customer group, and may also be a terminal or a server, which is not specifically limited here.
  • the embodiment of the present application is described by taking the server as an execution subject as an example.
  • There may be one or more experience-classified customer groups.
  • Business variables include variables of nominal data and variables of numerical data. For example, in the insurance claims business, a business variable may be whether an accident occurred (nominal data) or the actual amount of compensation (numerical data).
  • An empirical classification factor is a risk factor used to judge business risk customers based on business common sense: customers whose value for the factor falls within a certain range are determined to have a higher business risk. For example, for automobile liability insurance, the empirical classification factors are the driver's age, gender, historical accident records, and so on; for car damage insurance, they are the age, model, and date of manufacture of the car.
  • After being authorized by the customer, the server extracts or captures the customer information of the customer group to be processed, which includes the customer's personal information, business information, and information corresponding to the empirical classification factors. Using a preset label extraction algorithm, the server extracts label information from the customer information of the customer group to be processed; it then classifies this label information by the empirical classification factors to obtain classification label information, and determines the customers to be processed that correspond to the classification label information as the experience-classified customer group. In this way the customer information of the customer group to be processed is classified by the empirical classification factors.
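The classification by empirical factors described above can be sketched as a set of rules, one per factor. This is a minimal illustration only: the field names (`driver_age`, `historical_accidents`), the value ranges, and the function `classify_by_experience` are hypothetical and do not come from the application.

```python
# Hypothetical empirical classification factors: each maps a customer field to
# a predicate marking the value range assumed to carry higher business risk.
EMPIRICAL_FACTORS = {
    "driver_age": lambda age: age < 25,
    "historical_accidents": lambda n: n >= 2,
}


def classify_by_experience(customers, factors=EMPIRICAL_FACTORS):
    """Return {factor name: [customer ids]} for customers flagged by each factor."""
    groups = {name: [] for name in factors}
    for customer in customers:
        for name, is_risky in factors.items():
            if name in customer and is_risky(customer[name]):
                groups[name].append(customer["id"])
    return groups


customers = [
    {"id": "c1", "driver_age": 22, "historical_accidents": 0},
    {"id": "c2", "driver_age": 40, "historical_accidents": 3},
    {"id": "c3", "driver_age": 35, "historical_accidents": 0},
]
experience_groups = classify_by_experience(customers)
```

Each factor yields its own experience-classified group, which matches the statement that there may be one or more such groups.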
  • After business risk classification, the decision tree classified customer groups include customer groups with business risk and customer groups without business risk; the customer groups with business risk include potential, low, medium, and high business risk customer groups.
  • The decision tree rule customer group segmentation model splits nodes so as to reduce the mean square error, dividing the customer group to be processed into several subsets whose business variable values are as close as possible. It then identifies the subsets whose mean business variable value is significantly higher than the mean over the whole population and marks them as segmented groups of bad customers (that is, decision tree classified customer groups).
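To make the splitting principle concrete, the sketch below searches one feature for the threshold that most reduces squared error, which is the criterion a regression tree applies at each node. It is a simplified, single-split illustration in plain Python, not the model of the application; the data and the names `sse` and `best_split` are assumptions.

```python
def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(feature, target):
    """Find the threshold on one feature that most reduces squared error.

    Minimising the summed squared error of the two children is equivalent to
    maximising the size-weighted reduction in mean square error.
    """
    pairs = sorted(zip(feature, target))
    total = sse([t for _, t in pairs])
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal feature values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        reduction = total - sse(left) - sse(right)
        if reduction > best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (threshold, reduction)
    return best


# Hypothetical data: driver age versus actual compensation amount.
ages = [20, 22, 24, 40, 45, 50]
claims = [900.0, 1000.0, 1100.0, 100.0, 200.0, 150.0]
threshold, reduction = best_split(ages, claims)
```

Here the split falls between ages 24 and 40, separating a subset with high compensation amounts from one with low amounts; the high-mean subset is the kind of group the text marks as bad customers.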
  • When the business variable is nominal data, the server calls the decision tree rule customer group segmentation model corresponding to the business variable, which in this case is a decision tree classification model, and classifies the customer information of the customer group to be processed into business risk customer groups based on preset dimension classification rules to obtain the decision tree classified customer groups.
  • When the business variable is numerical data, the server calls the decision tree rule customer group segmentation model corresponding to the business variable, which in this case is a decision tree regression model. The model regresses the business variable on the customer information of the customer group to be processed to obtain business variable values, which are compared with preset business risk threshold conditions. Customers to be processed whose business variable value meets a threshold condition are assigned to the business risk group corresponding to that condition. The server calculates the mean business variable value of each business risk group (the group subset mean) and the mean over all customer groups to be processed (the comprehensive mean), and determines the business risk groups whose group subset mean is greater than the comprehensive mean as decision tree classified customer groups.
  • Alternatively, the server performs calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result, and classifies the customer groups to be processed according to the analysis result to obtain the decision tree classified customer groups.
  • The server merges the experience-classified customer groups and the decision tree classified customer groups into one set to obtain the initial business risk customer group; that is, the initial business risk customer group retains both the original experience-classified customer groups and the original decision tree classified customer groups. Alternatively, the server compares and classifies the experience-classified customer groups and the decision tree classified customer groups to obtain the same customer groups and the different customer groups, where the same customer groups are those shared by the two classifications and the different customer groups are those that differ between them. The same customer groups are de-duplicated and fused to obtain a fused customer group, and the fused customer group and the different customer groups are merged into one set to obtain the initial business risk customer group.
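The second merging option above (separate the shared and differing customers, de-duplicate, then recombine) can be sketched with set operations. Representing each customer group as a set of customer ids is an assumption made for illustration, as is the function name `merge_groups`.

```python
def merge_groups(experience_group, tree_group):
    """Merge two customer groups (sets of customer ids), de-duplicating overlap."""
    same = experience_group & tree_group       # customers present in both groups
    different = experience_group ^ tree_group  # customers present in exactly one
    initial_risk_group = same | different      # fused group plus differing customers
    return initial_risk_group, same, different


initial, same, different = merge_groups({"c1", "c2"}, {"c2", "c3"})
```

With sets, the result equals a plain union; keeping `same` and `different` separate is useful only if the two parts are processed differently before merging.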
  • The server obtains the business risk customer group type of the initial business risk customer group and retrieves the corresponding target prediction model through a pre-created correspondence between business risk customer group types and target prediction models. It then obtains the operation factors of the target prediction model and the business risk customer group information of the initial business risk customer group, where the operation factors are business risk prediction indicators constructed from the business variables and business needs, and the business risk customer group information is the customer information of the classified customer group. The server calls the target prediction model through a preset interface call address and, based on the operation factors, performs business-risk-based regression or classification on the business risk customer group information to obtain the business risk prediction value.
  • The initial business risk customer group is screened to obtain the target business risk customer group.
  • The server sorts the customers in the initial business risk customer group in descending order of business risk prediction value to obtain a candidate business risk customer group sequence, obtains a preset business risk range value, and divides the candidate business risk customer group sequence according to the business risk range value and the business risk prediction value to obtain the target business risk customer group. When the business risk prediction value is a probability value for each business risk level, the server compares the business risk prediction value with a preset threshold, determines the customers in the initial business risk customer group whose business risk prediction value is greater than the preset threshold as the candidate customer group, sorts the candidate customer group in descending order of business risk prediction value to obtain the candidate business risk customer group sequence, and selects customers from the sequence in order according to a preset ratio to obtain the target business risk customer group.
  • Alternatively, the server sorts the customers in the initial business risk customer group in descending order of business risk prediction value to obtain the candidate business risk customer group sequence, and selects customers from the sequence in order according to a preset ratio to obtain the target business risk customer group.
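The screening step above (optional threshold filter, descending sort, then selection by a preset ratio) can be sketched as follows. The function name `screen_top_ratio`, the sample predictions, and the choice to round the selected count up are assumptions for illustration.

```python
import math


def screen_top_ratio(predictions, ratio, threshold=None):
    """Keep the highest-risk share of customers, optionally after a threshold.

    predictions maps customer id -> business risk prediction value.
    """
    candidates = list(predictions.items())
    if threshold is not None:
        candidates = [(c, p) for c, p in candidates if p > threshold]
    # Candidate sequence: descending order of predicted business risk.
    ranked = sorted(candidates, key=lambda cp: cp[1], reverse=True)
    keep = math.ceil(len(ranked) * ratio)  # assumption: round the count up
    return [c for c, _ in ranked[:keep]]


predictions = {"c1": 0.91, "c2": 0.35, "c3": 0.78, "c4": 0.62}
target_group = screen_top_ratio(predictions, ratio=0.5)
target_above = screen_top_ratio(predictions, ratio=0.5, threshold=0.6)
```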
  • In the embodiments of the present application, classifying the diverse customer groups to be processed ensures the coverage of the target business risk customer group, and using the target prediction model to find the customers with higher business risk within each group ensures the accuracy with which the target prediction model captures the target business risk customer group. Classifying the customer information of the customer groups to be processed by both the empirical classification factors and the decision tree rule customer group segmentation model improves the accuracy of the initial business risk customer group. Together, these guarantee the accuracy of identifying high-risk customer groups (that is, target business risk customer groups), thereby improving the accuracy of business risk customer group identification based on a large number of customer groups.
  • Referring to FIG. 2, another embodiment of the method for identifying business risk customer groups in the embodiments of the present application includes:
  • The server acquires the customer information sample set and sample variables of historical customer groups, and constructs business variables related to the sample variables to obtain multiple related business variables. It constructs multiple decision tree models from the related business variables and the sample variables, and calculates the training mean lift (training mean improvement degree) of the customer group leaf nodes of each decision tree model, where each customer group leaf node includes a decision path. It filters the decision paths by training mean lift to obtain the dimension classification rules of the historical customer groups, and uses these rules with the multiple decision tree models to build the decision tree rule customer group segmentation model.
  • Specifically, the server extracts or captures the customer information sample set and sample variables of the historical customer groups, constructs business variables related to the sample variables to obtain multiple related business variables, and collects them into a variable list. It traverses the variable list and, for each related business variable, constructs a decision tree model of the customer information sample set and the historical customer groups from that variable and the sample variables, obtaining multiple decision tree models, each corresponding to one related business variable.
  • Each decision tree model includes initial customer group leaf nodes and the decision path corresponding to each initial customer group leaf node. An initial customer group leaf node includes the historical customers and their corresponding customer information samples, and a decision path includes variables and thresholds.
  • When selecting model parameters, the minimum number of samples per leaf node (min_samples_leaf) can be lowered appropriately.
  • The server calculates the training mean lift (training mean improvement degree) of each initial customer group leaf node in each decision tree model. The training mean lift indicates how far the mean of the sample variable for the historical customers in an initial customer group leaf node exceeds the mean of the sample variable over the overall sample. Specifically, the server calculates the mean of the sample variable in each leaf node finally split off by the decision tree model, recorded as node_mean, and divides node_mean by the mean of the sample variable over the overall sample to obtain the training mean lift.
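The lift calculation described above is a single ratio per leaf. The sketch below computes it for two hypothetical leaves; the leaf names and sample values are made up, and `training_mean_lift` is an illustrative helper rather than a name from the application.

```python
def training_mean_lift(leaf_samples, all_samples):
    """node_mean divided by the mean of the sample variable over all samples."""
    node_mean = sum(leaf_samples) / len(leaf_samples)
    overall_mean = sum(all_samples) / len(all_samples)
    return node_mean / overall_mean


# Hypothetical sample-variable values (e.g. compensation amounts) per leaf.
leaves = {"leaf_a": [300.0, 500.0], "leaf_b": [100.0, 100.0, 200.0]}
all_samples = [v for samples in leaves.values() for v in samples]
lifts = {name: training_mean_lift(s, all_samples) for name, s in leaves.items()}
```

A lift above 1 means the leaf's mean exceeds the overall mean, so `leaf_a` would be a candidate bad-customer segment while `leaf_b` would not.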
  • The server determines the target customer group leaf nodes of the multiple decision tree models according to the training mean lift, and determines the decision paths of the target customer group leaf nodes as the dimension classification rules of the historical customer groups. The dimension classification rules may be one-dimensional, two-dimensional, or three-dimensional.
  • Taking one-dimensional rules as an example, suppose that N decision tree models (that is, the multiple decision tree models) finally generate at most M leaf nodes (that is, the initial customer group leaf nodes). Among the M leaf nodes, the K leaf nodes with a higher training mean lift index (node_mean_lift), that is, the target customer group leaf nodes, are selected together with their decision paths (each of which includes the corresponding variables and thresholds) as the criteria for single-dimensional segmentation of bad customers (that is, the one-dimensional classification rules of the historical customer groups). What counts as a higher training mean lift index should be decided by weighing business objectives against sample coverage. The two-dimensional and three-dimensional classification rules of the historical customer groups are obtained in the same way. The K rules are used to generate K parallel groups of segmented customer samples; samples with higher coverage can be studied individually, and samples with lower coverage can be aggregated and studied together.
  • The server prunes the multiple decision tree models according to the dimension classification rules of the historical customer groups to obtain the decision tree rule customer group segmentation model, which is a decision tree regression model or a decision tree classification model.
  • The decision tree rule customer group segmentation model can single out a bad customer group (that is, a business risk customer group) whose actual compensation amount is more than 1.5 times higher, and whose accident rate is more than 1.2 times higher, than that of the overall population. By comparison, for the experience-based grouping of single-dimensional business risk customer groups, the increases in compensation amount and accident rate for the top 7% head group are only about 0.5 times and 0.4 times respectively. This shows that the decision tree rule customer group segmentation model captures business risk customer groups efficiently.
  • The server obtains the customer information samples, target variables, and business classification factors of the customers to be classified, and classifies the customers to be classified into business risk customer groups using the business classification factors, the decision tree rule customer group segmentation model, and the customer information samples, obtaining multiple business risk customer groups to be processed. Using the target variable, it constructs multiple initial prediction models for each business risk customer group to be processed, each being a regression prediction model or a classification prediction model. It predicts with and evaluates the initial prediction models of each business risk customer group to be processed to obtain evaluation values, sorts the models of each group in descending order of evaluation value, and determines the first-ranked initial prediction model as the target prediction model for that group.
  • The target variable includes variables of nominal data and variables of numerical data. For example, the variable may be whether there is an accident (nominal data) or the actual payment amount (numerical data).
  • A business classification factor is a risk factor used to judge business risk customers based on business common sense: customers whose value for the factor falls within a certain range are determined to have a higher business risk. For example, for automobile liability insurance, the business classification factors are the driver's age, gender, historical accident records, and so on; for car damage insurance, they are the age, model, and date of manufacture of the car.
  • After being authorized by the customer, the server extracts or captures the customer information of the customers to be classified (that is, the customer information samples). It extracts the label information of the customer information samples, classifies the label information by the business classification factors to obtain classified label information, and determines the customers to be classified that correspond to the classified label information as the experience-classified customer groups. Using the decision tree rule customer group segmentation model corresponding to the target variable, it also classifies the customer information samples of the customers to be classified into business risk customer groups to obtain the decision tree classified customer groups. The server can then combine the experience-classified customer groups and the decision tree classified customer groups to obtain multiple business risk customer groups to be processed. Alternatively, the server can compare the experience-classified customer groups with the decision tree classified customer groups to obtain the same customer groups and the different customer groups, call a preset support vector machine (SVM) classification model to classify the customer information samples of the different customer groups into business risk customer groups to obtain target customer groups, and merge the same customer groups with the target customer groups.
  • The server constructs the multiple initial prediction models for each business risk customer group to be processed according to the target variable as follows. When the target variable is nominal data, it constructs multiple classification prediction models for each business risk customer group to be processed, and trains and optimizes them to obtain the initial prediction models; a classification prediction model classifies the degree of business risk of the business risk customer group to be processed. When the target variable is numerical data, it constructs multiple regression prediction models for each business risk customer group to be processed, and trains and optimizes them to obtain the initial prediction models; a regression prediction model regresses the business risk value of the business risk customer group to be processed. Where the business requires model interpretability, the regression prediction model can be a strongly interpretable model such as generalized linear regression or decision tree regression; more sophisticated machine learning models may also be considered.
  • The server runs each of the initial prediction models of each business risk customer group to be processed to obtain the target prediction values of each model for that group, sorts the target prediction values in descending order, and determines the customers in a preset leading proportion of the sorted values (for example, the top 5%) as the customer group to be analyzed. It obtains the real classified customer group, calculates an evaluation value for each initial prediction model from the customer group to be analyzed and the real classified customer group, arranges the evaluation values in descending order, and determines the initial prediction model with the first-ranked evaluation value as the target prediction model for each business risk customer group to be processed.
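The model selection described above can be sketched as: rank customers by each model's prediction, take the leading share, score that share against the real classified group, and keep the best-scoring model. The text does not state which evaluation metric is used, so the hit rate below (share of top-ranked customers that are truly risky) is an assumption, as are the model names and data.

```python
import math


def top_share_hit_rate(predicted, truly_risky, share=0.05):
    """Share of the top-ranked customers that belong to the true risk group."""
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    top = ranked[:max(1, math.ceil(len(ranked) * share))]
    return sum(1 for c in top if c in truly_risky) / len(top)


def select_target_model(model_predictions, truly_risky, share=0.05):
    """Return the model name with the highest evaluation value, plus all scores."""
    scores = {name: top_share_hit_rate(pred, truly_risky, share)
              for name, pred in model_predictions.items()}
    return max(scores, key=scores.get), scores


# Hypothetical predictions of two candidate models for one segmented group.
truly_risky = {"c1", "c2"}
model_predictions = {
    "glm": {"c1": 0.9, "c2": 0.8, "c3": 0.2, "c4": 0.1},
    "tree": {"c1": 0.3, "c2": 0.9, "c3": 0.8, "c4": 0.1},
}
# share=0.5 is used only because the toy sample has four customers.
best_model, scores = select_target_model(model_predictions, truly_risky, share=0.5)
```

In practice this selection would run once per business risk customer group to be processed, so each group can end up with a different target prediction model.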
  • The risk level of the top 5% of high business risk customers predicted by separate modeling within the segmented customer groups (that is, by the target prediction model) is higher than that predicted by the overall model:
  • the mean actual compensation amount (that is, the target variable) of the top 5% of high business risk customers predicted by segmented customer group modeling (that is, the target prediction model) is about 40% higher than that of the top 5% of high-risk customers predicted by the overall model;
  • the average number of trips (that is, the target variable) of the top 5% of high business risk customers predicted by segmented customer group modeling (that is, the target prediction model) is about 20% higher than with the overall model.
  • each circled bad customer that is, the business risk customer group to be processed
  • the target prediction model that is, the target prediction model
  • The rules of the decision tree rule customer group segmentation model can be output clearly. Unlike the prediction that a complex model produces for a given business risk customer, which is difficult for front-end sales staff to understand, the rules output by the decision tree rule customer group segmentation model are relatively intuitive. The variables used to segment business risk customer groups can be extended: drawing on front-end business knowledge, more important variables can be selected, which effectively combines technology with business experience. The criteria for segmenting business risk customer groups can be determined flexibly. The business risk customer groups are described in depth, and the target prediction model is used to find the customers with higher business risk within each group, which ensures the accuracy with which the target prediction model captures the business risk customer groups to be processed and thereby improves the accuracy of business risk customer group identification based on a large number of customer groups.
  • The execution process of step 203 is similar to that of step 101 above and is not repeated here.
  • the server obtains the target dimension corresponding to the business variable, the target dimension being any one of the one-dimensional, two-dimensional, and three-dimensional variables corresponding to the business variable; calls the decision tree rule customer group segmentation model corresponding to the target dimension and, based on the preset dimension classification rules, performs mean lift calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result; and classifies the customer group to be processed according to the analysis result to obtain the decision-tree-classified customer group.
  • the server obtains the target dimension corresponding to the business variable, the target dimension being any one of the one-dimensional, two-dimensional, and three-dimensional variables of the business variable, and creates a Structured Query Language (SQL) statement for the target dimension; it queries the decision tree models in the preset database through the SQL statement to obtain the corresponding decision tree rule customer group segmentation model and its model call address;
  • it calls the decision tree rule customer group segmentation model through the model call address and, based on the preset dimension classification rules, calculates the target variable mean lift of the customer information of the customer group to be processed in each customer group leaf node of the model, calculates the comprehensive mean lift of the customer information of the customer group to be processed across all customer group leaf nodes, and determines whether the target variable mean lift is greater than the comprehensive mean lift to obtain the analysis result;
  • if the analysis result is yes, the customer group to be processed in the corresponding customer group leaf node is determined as the customer group to be classified; its target variable mean lift is compared with the preset lift index standards, and a customer group to be classified whose target variable mean lift meets a lift index standard is classified as the risk group corresponding to that standard, thereby obtaining the decision-tree-classified customer group;
  • the lift index standards include those for the potential, low, medium, and high business risk customer groups; for example, the lift index standard for medium-risk customers is 2 and that for high-risk customers is 3.
  • The execution process of step 205 is similar to that of step 103 above and is not repeated here.
  • the server obtains the business risk customer group information and customer group type of the initial business risk customer group, and traverses the preset prediction model structure tree by customer group type to obtain the target prediction model corresponding to the initial business risk customer group and the interface call address of the target prediction model; it then calls the target prediction model through the interface call address and performs business-risk-based regression or classification processing on the business risk customer group information to obtain the business risk prediction value.
  • the server obtains the business risk customer group information of the initial business risk customer group, which includes the customer information of the initial business risk customer group and the classification label information of that customer information; calls the preset label extraction algorithm to extract the classification type from the classification label information, obtaining the customer group type; creates an index of the customer group type and traverses the preset prediction model structure tree through the index to obtain the corresponding target prediction model and its interface call address; and obtains the business risk prediction index, calls the target prediction model through the interface call address, and, based on the business risk prediction index, performs business-risk-based regression or classification processing on the business risk customer group information to obtain the business risk prediction value.
  • when the target prediction model performs business-risk-based regression processing, the business risk prediction value is the numerical business risk data corresponding to the business variable; for example, with the actual compensation amount as the business variable, the business risk prediction value is the actual compensation amount at risk. When the target prediction model performs business-risk-based classification processing, the business risk prediction value is the probability of each business risk level.
  • the initial business risk customer group is screened to obtain the target business risk customer group.
  • the server sorts the customers in the initial business risk customer group in descending order of business risk prediction value to obtain a candidate business risk customer group sequence; and, based on the preset ratio, selects from the candidate business risk customer group sequence in order to obtain the target business risk customer group.
  • the server sorts the initial business risk customer group in descending order of business risk prediction value to obtain the candidate business risk customer group sequence, and obtains the data type of the business risk prediction value; if the data type is numerical business risk data corresponding to the business variable, it obtains the preset business risk range value, divides the candidate business risk customer group sequence according to the business risk range value and the business risk prediction value to obtain the candidate business risk customer group corresponding to the business risk range value, and, based on the preset ratio, reads that candidate group in order to obtain the target business risk customer group; if the data type is the probability of each business risk level, it selects the customers in the candidate business risk customer group sequence whose business risk prediction value is greater than the preset threshold to obtain the selected customer group, and, based on the preset ratio, selects from the selected customer group in order to obtain the target business risk customer group.
  • the accuracy of the initial business risk customer group is improved, and the identification accuracy of high-risk customer groups (i.e., target business risk customer groups) is maintained while customer group coverage increases, thereby improving the accuracy of business risk customer group identification over massive customer groups.
  • the method for identifying business risk customer groups in the embodiments of this application is described above; the apparatus for identifying business risk customer groups is described below. Referring to FIG. 3, one embodiment of the apparatus for identifying business risk customer groups in the embodiments of this application includes:
  • the classification module 301 is used to obtain the customer information, business variables, and experience classification factors of the customer group to be processed, and to classify the customer information of the customer group to be processed by the experience classification factors to obtain the experience-classified customer group;
  • the first prediction module 302 is used to call the decision tree rule customer group segmentation model corresponding to the business variable and perform business risk customer group prediction on the customer information of the customer group to be processed to obtain the decision-tree-classified customer group;
  • the merging module 303 is used to merge the experience-classified customer groups and the decision tree-classified customer groups to obtain the initial business risk customer group;
  • the second prediction module 304 is used to obtain the business risk customer group information of the initial business risk customer group, call the target prediction model corresponding to the initial business risk customer group, perform business risk prediction on the business risk classification customer group information, and obtain the business risk prediction value;
  • the screening module 305 is configured to screen the initial business risk customer group according to the business risk prediction value to obtain the target business risk customer group.
  • the functions of the modules in the above apparatus for identifying business risk customer groups correspond to the steps in the above method embodiment, and their functions and implementation are not repeated here.
  • the coverage of the target business risk customer group is ensured by classifying the customer group to be processed in diverse ways; customers with higher business risk within each group are found through the target prediction model, which ensures the accuracy with which the target prediction model captures the target business risk customer group; classifying the customer information of the customer group to be processed through the experience classification factors and the decision tree rule customer group segmentation model improves the accuracy of the initial business risk customer group; and the identification accuracy of high-risk customer groups (i.e., target business risk customer groups) is maintained while customer group coverage increases, thereby improving the accuracy of business risk customer group identification over massive customer groups.
  • Referring to FIG. 4, another embodiment of the apparatus for identifying business risk customer groups in the embodiments of this application includes:
  • the first creation module 306 is used to obtain sample variables and create a decision tree rule customer group segmentation model corresponding to the sample variables, the model being a decision tree regression model or a decision tree classification model;
  • the second creation module 307 is used to obtain a plurality of business risk customer groups to be processed through the decision tree rule customer group segmentation model, and to construct a target prediction model corresponding to each business risk customer group to be processed;
  • the classification module 301 is used to obtain the customer information, business variables, and experience classification factors of the customer group to be processed, and to classify the customer information of the customer group to be processed by the experience classification factors to obtain the experience-classified customer group;
  • the first prediction module 302 is used to call the decision tree rule customer group segmentation model corresponding to the business variable and perform business risk customer group prediction on the customer information of the customer group to be processed to obtain the decision-tree-classified customer group;
  • the merging module 303 is used to merge the experience-classified customer groups and the decision tree-classified customer groups to obtain the initial business risk customer group;
  • the second prediction module 304 is used to obtain the business risk customer group information of the initial business risk customer group, call the target prediction model corresponding to the initial business risk customer group, perform business risk prediction on the business risk classification customer group information, and obtain the business risk prediction value;
  • the screening module 305 is configured to screen the initial business risk customer group according to the business risk prediction value to obtain the target business risk customer group.
  • the first creating module 306 may also be specifically used for:
  • the second creating module 307 may also be specifically used for:
  • the first prediction module 302 may also be specifically used for:
  • obtain the target dimension corresponding to the business variable, the target dimension being any one of the one-dimensional, two-dimensional, and three-dimensional variables corresponding to the business variable; call the decision tree rule customer group segmentation model corresponding to the target dimension and, based on the preset dimension classification rules, perform mean lift calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result; and classify the customer group to be processed according to the analysis result to obtain the decision-tree-classified customer group.
  • the second prediction module 304 may also be specifically used for:
  • obtain the business risk customer group information and customer group type of the initial business risk customer group, and traverse the preset prediction model structure tree by customer group type to obtain the target prediction model corresponding to the initial business risk customer group and the interface call address of the target prediction model; call the target prediction model through the interface call address and perform business-risk-based regression or classification processing on the business risk customer group information to obtain the business risk prediction value.
  • the screening module 305 may also be specifically used for:
  • sort the customers in the initial business risk customer group in descending order of business risk prediction value to obtain the candidate business risk customer group sequence; and, based on the preset ratio, select from the candidate business risk customer group sequence in order to obtain the target business risk customer group.
  • the functions of the modules and units in the above apparatus for identifying business risk customer groups correspond to the steps in the above method embodiment, and their functions and implementation are not repeated here.
  • the accuracy of the initial business risk customer group is improved, and the identification accuracy of high-risk customer groups (i.e., target business risk customer groups) is maintained while customer group coverage increases, thereby improving the accuracy of business risk customer group identification over massive customer groups.
  • FIG. 3 and FIG. 4 above describe in detail the apparatus for identifying business risk customer groups in the embodiments of this application from the perspective of modular functional entities; the device for identifying business risk customer groups is described in detail below from the perspective of hardware processing.
  • Fig. 5 is a schematic structural diagram of an identification device for a business risk customer group provided by an embodiment of the present application.
  • the identification device 500 for business risk customer groups may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 510, a memory 520, and one or more storage media 530 (for example, one or more mass storage devices) storing application programs 533 or data 532.
  • the memory 520 and the storage medium 530 may be temporary storage or persistent storage.
  • the program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations in the device 500 for identifying business risk groups.
  • the processor 510 may be configured to communicate with the storage medium 530, and execute a series of instruction operations in the storage medium 530 on the identification device 500 of the business risk customer group.
  • the identification device 500 for business risk customer groups may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
  • the present application also provides a device for identifying business risk customer groups, including a memory and at least one processor, the memory storing instructions and the memory and the at least one processor being interconnected by lines; the at least one processor invokes the instructions in the memory so that the device executes the steps of the above method for identifying business risk customer groups.
  • the present application also provides a computer-readable storage medium, which may be non-volatile or volatile; instructions are stored in the computer-readable storage medium and, when run on a computer, cause the computer to execute the steps of the method for identifying business risk customer groups.
  • the computer-readable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function, etc., and the data storage area may store data created through use, etc.
  • A blockchain, essentially a decentralized database, is a chain of data blocks linked to one another by cryptographic methods. Each data block contains a batch of network transaction information, used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Evolutionary Biology (AREA)
  • Strategic Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Evolutionary Computation (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

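A method, apparatus, device, and storage medium for identifying business risk customer groups, relating to the field of artificial intelligence and used to improve the accuracy of business risk customer group identification over massive customer groups. The method includes: classifying the customer information of a customer group to be processed by experience classification factors to obtain an experience-classified customer group (101); calling a decision tree rule customer group segmentation model corresponding to a business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining a decision-tree-classified customer group (102); merging the experience-classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group (103); calling a target prediction model to perform business risk prediction on the business risk classification customer group information, obtaining a business risk prediction value (104); and screening the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group (105). Blockchain technology is also involved: the customer information of the customer group to be processed may be stored in a blockchain.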

Description

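Method, apparatus, device, and storage medium for identifying business risk customer groups
This application claims priority to Chinese patent application No. 202110762845.4, entitled "Method, apparatus, device, and storage medium for identifying business risk customer groups" and filed with the China Patent Office on July 6, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of intelligent decision-making in artificial intelligence, and in particular to a method, apparatus, device, and storage medium for identifying business risk customer groups.
Background
Under the operating principle that business risk must be controllable, customers with high business risk need to be identified and targeted for intervention. Current methods for identifying high-risk customer groups generally build a model on the overall customer group to obtain a customer group identification model, use that model to make a business judgment on the overall customer group, and define as target customers (i.e., the high-risk customer group) those whose predicted value exceeds a certain threshold or who rank at the head of the predicted values.
The inventors realized that with the above methods it is often difficult to increase customer group coverage without reducing the accuracy of high-risk customer group identification, which leads to low accuracy of business risk customer group identification over massive customer groups.
Summary
This application provides a method, apparatus, device, and storage medium for identifying business risk customer groups, used to improve the accuracy of business risk customer group identification over massive customer groups.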
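A first aspect of this application provides a method for identifying business risk customer groups, including:
obtaining the customer information, business variables, and experience classification factors of a customer group to be processed, and classifying the customer information of the customer group to be processed by the experience classification factors to obtain an experience-classified customer group;
calling a decision tree rule customer group segmentation model corresponding to the business variable, and performing business risk customer group prediction on the customer information of the customer group to be processed to obtain a decision-tree-classified customer group;
merging the experience-classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
obtaining the business risk customer group information of the initial business risk customer group, calling a target prediction model corresponding to the initial business risk customer group, and performing business risk prediction on the business risk classification customer group information to obtain a business risk prediction value;
screening the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group.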
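A second aspect of this application provides a device for identifying business risk customer groups, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer-readable instructions:
obtaining the customer information, business variables, and experience classification factors of a customer group to be processed, and classifying the customer information of the customer group to be processed by the experience classification factors to obtain an experience-classified customer group;
calling a decision tree rule customer group segmentation model corresponding to the business variable, and performing business risk customer group prediction on the customer information of the customer group to be processed to obtain a decision-tree-classified customer group;
merging the experience-classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
obtaining the business risk customer group information of the initial business risk customer group, calling a target prediction model corresponding to the initial business risk customer group, and performing business risk prediction on the business risk classification customer group information to obtain a business risk prediction value;
screening the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group.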
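A third aspect of this application provides a computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps:
obtaining the customer information, business variables, and experience classification factors of a customer group to be processed, and classifying the customer information of the customer group to be processed by the experience classification factors to obtain an experience-classified customer group;
calling a decision tree rule customer group segmentation model corresponding to the business variable, and performing business risk customer group prediction on the customer information of the customer group to be processed to obtain a decision-tree-classified customer group;
merging the experience-classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
obtaining the business risk customer group information of the initial business risk customer group, calling a target prediction model corresponding to the initial business risk customer group, and performing business risk prediction on the business risk classification customer group information to obtain a business risk prediction value;
screening the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group.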
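A fourth aspect of this application provides an apparatus for identifying business risk customer groups, including:
a classification module, used to obtain the customer information, business variables, and experience classification factors of a customer group to be processed, and to classify the customer information of the customer group to be processed by the experience classification factors to obtain an experience-classified customer group;
a first prediction module, used to call a decision tree rule customer group segmentation model corresponding to the business variable and perform business risk customer group prediction on the customer information of the customer group to be processed to obtain a decision-tree-classified customer group;
a merging module, used to merge the experience-classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
a second prediction module, used to obtain the business risk customer group information of the initial business risk customer group, call a target prediction model corresponding to the initial business risk customer group, and perform business risk prediction on the business risk classification customer group information to obtain a business risk prediction value;
a screening module, used to screen the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group.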
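In the technical solution provided by this application, the customer information, business variables, and experience classification factors of a customer group to be processed are obtained, and the customer information is classified by the experience classification factors to obtain an experience-classified customer group; a decision tree rule customer group segmentation model corresponding to the business variable is called to perform business risk customer group prediction on the customer information, obtaining a decision-tree-classified customer group; the two groups are merged to obtain an initial business risk customer group; the business risk customer group information of the initial business risk customer group is obtained, and a target prediction model corresponding to the initial business risk customer group is called to perform business risk prediction on the business risk classification customer group information, obtaining a business risk prediction value; and the initial business risk customer group is screened by the business risk prediction value to obtain a target business risk customer group. In the embodiments of this application, classifying the customer group to be processed in diverse ways ensures the coverage of the target business risk customer group; finding the customers with higher business risk within each group through the target prediction model ensures the accuracy with which the target prediction model captures the target business risk customer group; and classifying the customer information of the customer group to be processed through the experience classification factors and the decision tree rule customer group segmentation model improves the accuracy of the initial business risk customer group. The identification accuracy of high-risk customer groups (i.e., target business risk customer groups) is thus maintained while customer group coverage increases, which improves the accuracy of business risk customer group identification over massive customer groups.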
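Brief Description of the Drawings
FIG. 1 is a schematic diagram of one embodiment of the method for identifying business risk customer groups in the embodiments of this application;
FIG. 2 is a schematic diagram of another embodiment of the method for identifying business risk customer groups in the embodiments of this application;
FIG. 3 is a schematic diagram of one embodiment of the apparatus for identifying business risk customer groups in the embodiments of this application;
FIG. 4 is a schematic diagram of another embodiment of the apparatus for identifying business risk customer groups in the embodiments of this application;
FIG. 5 is a schematic diagram of one embodiment of the device for identifying business risk customer groups in the embodiments of this application.
Detailed Description
The embodiments of this application provide a method, apparatus, device, and storage medium for identifying business risk customer groups, which improve the accuracy of business risk customer group identification over massive customer groups.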
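The terms "first", "second", "third", "fourth", and the like (if any) in the specification, claims, and drawings of this application are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. Furthermore, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such a process, method, product, or device.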
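For ease of understanding, the specific flow of the embodiments of this application is described below. Referring to FIG. 1, one embodiment of the method for identifying business risk customer groups in the embodiments of this application includes:
101. Obtain the customer information, business variables, and experience classification factors of the customer group to be processed, and classify the customer information of the customer group to be processed by the experience classification factors to obtain the experience-classified customer group.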
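It can be understood that the execution subject of this application may be an apparatus for identifying business risk customer groups, or a terminal or a server, which is not limited here. The embodiments of this application are described with a server as the execution subject.
There may be one or more experience-classified customer groups. Business variables include nominal variables and numerical variables; for example, taking insurance claims as the business, the variable may be whether a claim occurred (a nominal variable) or the actual compensation amount (a numerical variable). An experience classification factor is a risk factor for judging business risk customers according to business common sense; it is used to determine that customers whose values fall within a certain range carry higher business risk. For example, for automobile liability insurance the experience classification factors are the driver's age, sex, claim history, and so on; for vehicle damage insurance they are the vehicle's age, model, date of manufacture, and so on.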
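After obtaining customer authorization, the server extracts or crawls the customer information of the customer group to be processed, which includes the customers' personal information, business information, and the information corresponding to the experience classification factors. It extracts label information from the customer information through a preset label extraction algorithm to obtain the label information of the customer group to be processed, classifies that label information by the experience classification factors to obtain classified label information, and determines the customer group to be processed that corresponds to the classified label information as the experience-classified customer group, thereby classifying the customer information of the customer group to be processed by the experience classification factors.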
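The experience-based classification in step 101 can be pictured as applying one rule per experience classification factor. The following is only a minimal sketch: the field names (`driver_age`, `past_claims`), the thresholds, and the function `classify_by_experience` are hypothetical illustrations, not values or interfaces given in the application, and the label extraction algorithm is not modelled.

```python
from typing import Callable, Dict, List

Customer = Dict[str, float]

# Hypothetical experience classification factors: each maps a factor name to a
# rule flagging the value range that business common sense treats as higher risk.
# The thresholds (25 years, 2 claims) are illustrative assumptions only.
EXPERIENCE_FACTORS: Dict[str, Callable[[Customer], bool]] = {
    "young_driver": lambda c: c["driver_age"] < 25,
    "prior_claims": lambda c: c["past_claims"] >= 2,
}


def classify_by_experience(customers: List[Customer]) -> Dict[str, List[Customer]]:
    """Return one experience-classified customer group per factor."""
    groups: Dict[str, List[Customer]] = {name: [] for name in EXPERIENCE_FACTORS}
    for customer in customers:
        for name, rule in EXPERIENCE_FACTORS.items():
            if rule(customer):
                groups[name].append(customer)
    return groups
```

A customer can fall into several groups at once, which is consistent with the statement that there may be one or more experience-classified customer groups.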
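102. Call the decision tree rule customer group segmentation model corresponding to the business variable, and perform business risk customer group prediction on the customer information of the customer group to be processed to obtain the decision-tree-classified customer group.
The decision-tree-classified customer groups obtained after business risk customer group classification include customer groups with business risk and customer groups without business risk; the groups with business risk include the potential, low, medium, and high business risk customer groups. The decision tree rule customer group segmentation model relies on the principle that each split aims to reduce the mean squared error: it divides the customer group to be processed into several subsets whose business variable values are closest to one another, then identifies the subsets whose mean business variable value is significantly higher than the overall mean and marks them as segmented bad-customer groups (i.e., decision-tree-classified customer groups).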
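When the business variable is a nominal variable, the server calls the decision tree rule customer group segmentation model corresponding to the business variable, which in this case is a decision tree classification model, and, based on the preset dimension classification rules, performs business risk customer group classification on the customer information of the customer group to be processed to obtain the decision-tree-classified customer group. When the business variable is a numerical variable, the server calls the decision tree rule customer group segmentation model corresponding to the business variable, which in this case is a decision tree regression model; based on the preset dimension classification rules, it regresses the business variable on the customer information of the customer group to be processed to obtain business variable values, compares the business variable values with the preset business risk threshold conditions, and assigns the customers whose business variable values satisfy a business risk threshold condition to the business risk group corresponding to that condition. It then calculates the mean business variable value of each business risk group (the group subset mean) and the mean business variable value of the whole customer group to be processed (the comprehensive mean), and determines the business risk groups whose group subset mean is greater than the comprehensive mean as the decision-tree-classified customer group. Alternatively, the server performs mean lift calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result, and classifies the customer group to be processed according to the analysis result to obtain the decision-tree-classified customer group.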
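For the numerical branch, the final comparison of each group subset mean against the comprehensive mean can be sketched as follows. This is an illustrative sketch only: it assumes the regression and threshold steps have already produced one list of business variable values per business risk group, and the function name `select_risk_groups` is hypothetical.

```python
from statistics import mean
from typing import Dict, List


def select_risk_groups(groups: Dict[str, List[float]]) -> List[str]:
    """Keep the groups whose subset mean exceeds the comprehensive mean.

    `groups` maps a business risk group name to the business variable values
    (for example, compensation amounts) of the customers in that group.
    """
    all_values = [value for values in groups.values() for value in values]
    comprehensive_mean = mean(all_values)  # mean over every customer
    return [
        name
        for name, values in groups.items()
        if values and mean(values) > comprehensive_mean
    ]
```

With groups of `[100, 200]`, `[1000, 3000]`, and `[50]`, the comprehensive mean is 870, so only the second group is kept.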
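103. Merge the experience-classified customer group and the decision-tree-classified customer group to obtain the initial business risk customer group.
The server merges the experience-classified customer group and the decision-tree-classified customer group into one set to obtain the initial business risk customer group; that is, the initial business risk customer group retains the original experience-classified customer group and the original decision-tree-classified customer group. Alternatively, the server compares and classifies the two groups to obtain same-type and different-type customer groups: the same-type customer groups are those that appear in both the experience-classified and the decision-tree-classified customer groups, and the different-type customer groups are those that do not. It de-duplicates and fuses the same-type customer groups, then merges the fused customer group and the different-type customer groups into one set to obtain the initial business risk customer group.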
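The second merging variant (de-duplicate what both groups share, then combine) reduces to a union keyed on a customer identifier. The sketch below assumes each customer record carries a unique `customer_id` field, which is an illustrative assumption rather than something the application specifies.

```python
from typing import Dict, List

Customer = Dict[str, object]


def merge_groups(experience_group: List[Customer],
                 tree_group: List[Customer]) -> List[Customer]:
    """Union of both groups, keeping the first record seen for each customer."""
    merged: Dict[object, Customer] = {}
    for customer in list(experience_group) + list(tree_group):
        # setdefault keeps the existing record when the id was already seen,
        # so customers present in both groups appear only once.
        merged.setdefault(customer["customer_id"], customer)
    return list(merged.values())
```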
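104. Obtain the business risk customer group information of the initial business risk customer group, call the target prediction model corresponding to the initial business risk customer group, and perform business risk prediction on the business risk classification customer group information to obtain the business risk prediction value.
The server obtains the business risk customer group type of the initial business risk customer group and, through the pre-created correspondence between business risk customer group types and target prediction models, retrieves the target prediction model corresponding to the initial business risk customer group. It obtains the operation factors of the target prediction model and the business risk customer group information of the initial business risk customer group; the operation factors are business risk prediction indices constructed from the business variables and business requirements, and the business risk customer group information is the customer information of the classified customer groups. It then calls the target prediction model through the preset interface call address and, based on the operation factors, performs business-risk-based regression or classification processing on the business risk classification customer group information to obtain the business risk prediction value.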
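The correspondence between customer group types and target prediction models can be pictured as a lookup table. In the sketch below, the registry, the group type names, and the toy scoring formulas are all hypothetical; an in-process callable stands in for the interface call address, which the application does not describe in detail.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]
Model = Callable[[Features], float]

# Hypothetical registry standing in for the pre-created correspondence between
# customer group types and target prediction models. The formulas are toy
# placeholders, not models from the application.
MODEL_REGISTRY: Dict[str, Model] = {
    "young_driver": lambda f: 1000.0 + 50.0 * f["past_claims"],
    "old_vehicle": lambda f: 800.0 + 20.0 * f["vehicle_age"],
}


def predict_group(group_type: str, customers: List[Features]) -> List[float]:
    """Look up the model for a group type and score every customer in the group."""
    try:
        model = MODEL_REGISTRY[group_type]
    except KeyError:
        raise ValueError(f"no target prediction model for group type {group_type!r}")
    return [model(features) for features in customers]
```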
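105. Screen the initial business risk customer group by the business risk prediction value to obtain the target business risk customer group.
When the business risk prediction value is numerical business risk data corresponding to the business variable, the server sorts the customers in the initial business risk customer group in descending order of business risk prediction value to obtain a candidate business risk customer group sequence, obtains the preset business risk range value, and divides the candidate business risk customer group sequence according to the business risk range value and the business risk prediction value to obtain the target business risk customer group. When the business risk prediction value is the probability of each business risk level, the server compares the business risk prediction value with the preset threshold, determines the customers in the initial business risk customer group whose business risk prediction value is greater than the preset threshold as the pre-candidate customer group, sorts the pre-candidate customer group in descending order of business risk prediction value to obtain the candidate business risk customer group sequence, and selects from that sequence in order according to the preset ratio to obtain the target business risk customer group. Alternatively, the server sorts the customers in the initial business risk customer group in descending order of business risk prediction value to obtain the candidate business risk customer group sequence and, based on the preset ratio, selects from the sequence in order to obtain the target business risk customer group.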
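The last alternative in this step, sorting by prediction value and keeping a preset share from the head of the sequence, can be sketched as follows. Rounding the head count up with `math.ceil` is an assumption made for the illustration; the application does not say how fractional counts are handled.

```python
import math
from typing import Dict, List


def screen_top_ratio(predictions: Dict[str, float], ratio: float) -> List[str]:
    """Return the customer ids in the top `ratio` share by prediction value."""
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    keep = math.ceil(len(ranked) * ratio)  # assumption: round the head count up
    return [customer_id for customer_id, _ in ranked[:keep]]
```

For example, with four customers and `ratio=0.5`, the two customers with the largest prediction values are returned in descending order.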
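In the embodiments of this application, classifying the customer group to be processed in diverse ways ensures the coverage of the target business risk customer group; finding the customers with higher business risk within each group through the target prediction model ensures the accuracy with which the target prediction model captures the target business risk customer group; and classifying the customer information of the customer group to be processed through the experience classification factors and the decision tree rule customer group segmentation model improves the accuracy of the initial business risk customer group. The identification accuracy of high-risk customer groups (i.e., target business risk customer groups) is thus maintained while customer group coverage increases, which improves the accuracy of business risk customer group identification over massive customer groups.
Referring to FIG. 2, another embodiment of the method for identifying business risk customer groups in the embodiments of this application includes:
201. Obtain sample variables and create a decision tree rule customer group segmentation model corresponding to the sample variables, the model being a decision tree regression model or a decision tree classification model.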
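Specifically, the server obtains the customer information sample set and sample variables of historical customer groups, and constructs business variables related to the sample variables to obtain a plurality of related business variables; builds a plurality of decision tree models from the related business variables and the sample variables, and calculates the training mean lift of the customer group leaf nodes of each decision tree model, each customer group leaf node including a decision path; filters the decision paths by training mean lift to obtain the dimension classification rules of the historical customer groups; and constructs, from those dimension classification rules, the decision tree rule customer group segmentation model of the plurality of decision tree models.
After obtaining user authorization, the server extracts or crawls the customer information sample set and sample variables of the historical customer groups, constructs business variables related to the sample variables to obtain a plurality of related business variables, and consolidates them into a variable list. It traverses the related business variables in the variable list and, for each traversed related business variable together with the sample variables, builds a decision tree model over the customer information sample set and the historical customer groups, obtaining a plurality of decision tree models, one per related business variable. Each decision tree model includes initial customer group leaf nodes and the decision paths corresponding to them; an initial customer group leaf node includes the historical customer group and the corresponding customer information samples, and a decision path includes variables and thresholds. If the business goal is to find customer groups with a relatively high sample variable (for example, policy compensation amount) and low coverage of the resulting rule is acceptable, the minimum number of samples per leaf node (min_samples_leaf) can be set somewhat lower when choosing model parameters.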
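The server calculates the training mean lift of the initial customer group leaf nodes in each decision tree model. The training mean lift measures the degree to which the mean sample variable of the historical customer group within each initial customer group leaf node exceeds the sample variable value of the overall sample. For example, with the average compensation amount as the sample variable mean, the training mean lift measures how far the average compensation amount of the customer group in each initial customer group leaf node exceeds that of the overall sample; it can be understood as the deviation of the business risk of the customer group in that leaf node from the overall business risk (no difference from the overall sample: training mean lift equal to 1; higher than overall: greater than 1; lower than overall: less than 1). Specifically, within each decision tree model, the server calculates the mean of the sample variable in each leaf node into which the model finally splits, denoted node_mean, and divides node_mean by the mean of the sample variable over the overall sample to obtain the training mean lift.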
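The training mean lift is the ratio of `node_mean` to the overall sample mean. A minimal sketch of that arithmetic, assuming the sample variable values of a leaf node and of the overall sample are available as plain lists (the function name is illustrative):

```python
from statistics import mean
from typing import Sequence


def training_mean_lift(leaf_values: Sequence[float],
                       all_values: Sequence[float]) -> float:
    """node_mean divided by the mean of the sample variable over all samples."""
    node_mean = mean(leaf_values)
    return node_mean / mean(all_values)
```

If a leaf holds compensation amounts of 300 and 500 while the overall sample is 100, 100, 300, and 500, then `node_mean` is 400, the overall mean is 250, and the lift is 1.6, meaning the leaf is riskier than the population as a whole.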
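The server determines the target customer group leaf nodes of the plurality of decision tree models according to the training mean lift, and determines the decision paths of the target customer group leaf nodes as the dimension classification rules of the historical customer groups; the dimension of these rules may be one, two, or three. For example, taking one-dimensional rules: suppose at most N decision tree models (the plurality of decision tree models) finally generate M leaf nodes (the initial customer group leaf nodes). The leaf nodes with a relatively high training mean lift index (node_mean_lift) among the M leaf nodes (say K of them, the target customer group leaf nodes) are selected together with their decision paths (each decision path includes the corresponding variables and thresholds) as the criteria for dividing bad customers into single-dimension segmented customer groups (i.e., the one-dimensional classification rules of the historical customer groups). What counts as a relatively high training mean lift index should be decided by weighing the business goal against sample coverage. The two-dimensional and three-dimensional classification rules of the historical customer groups are obtained in the same way. The K sets of rules generate K parallel sets of segmented customer group samples; samples with higher coverage can be studied individually, while samples with lower coverage can be pooled and their characteristics studied together.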
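Selecting the K leaf nodes with the highest `node_mean_lift` while also weighing coverage can be sketched as below. The `Leaf` record, the `min_coverage` parameter, and the example thresholds are assumptions made for illustration; the application only says that business goals and sample coverage should both be considered.

```python
from typing import List, NamedTuple, Tuple


class Leaf(NamedTuple):
    # Each step of the decision path is (variable, operator, threshold).
    decision_path: Tuple[Tuple[str, str, float], ...]
    node_mean_lift: float
    coverage: float  # share of all samples that fall into this leaf


def pick_rules(leaves: List[Leaf], k: int, min_coverage: float = 0.0) -> List[Leaf]:
    """Return the k leaves with the highest lift among sufficiently covered leaves."""
    eligible = [leaf for leaf in leaves if leaf.coverage >= min_coverage]
    return sorted(eligible, key=lambda leaf: leaf.node_mean_lift, reverse=True)[:k]
```

Raising `min_coverage` trades some lift for rules that apply to more customers, which mirrors the trade-off described for `min_samples_leaf`.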
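The server prunes the plurality of decision tree models according to the dimension classification rules of the historical customer groups to obtain the decision tree rule customer group segmentation model, which is a decision tree regression model or a decision tree classification model.
An experimental evaluation of the decision tree rule customer group segmentation model found that it distinguishes high-business-risk customer groups in a more targeted way. For example, taking two years of historical underwriting customers of a property insurance product, modeling tests with the decision tree rule customer group segmentation model produced more precise results: if 7% of customers are circled as the high-business-risk group, the model can circle a bad customer group (i.e., the business risk customer group) whose actual compensation amount is more than 1.5 times higher than that of the overall population and whose claim rate is more than 1.2 times higher. By contrast, with single-dimension experience-based grouping of business risk customers, the top 7% show increases in compensation amount and claim rate of only about 0.5 times and 0.4 times, respectively. The decision tree rule customer group segmentation model is therefore able to capture business risk customer groups efficiently.
202. Obtain a plurality of business risk customer groups to be processed through the decision tree rule customer group segmentation model, and construct a target prediction model corresponding to each business risk customer group to be processed.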
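Specifically, the server obtains the customer information samples, target variables, and business classification factors of the customers to be classified, and classifies the customers to be classified into business risk customer groups using the business classification factors, the decision tree rule customer group segmentation model, and the customer information samples, obtaining a plurality of business risk customer groups to be processed; constructs, from the target variables, a plurality of initial prediction models for each business risk customer group to be processed, each initial prediction model being a regression prediction model or a classification prediction model; evaluates the predictions of the initial prediction models of each business risk customer group to be processed to obtain evaluation values; and sorts the initial prediction models of each business risk customer group to be processed in descending order of evaluation value, determining the first-ranked initial prediction model as the target prediction model for that group.
Target variables include nominal variables and numerical variables; for example, taking insurance claims as the business, the variable may be whether a claim occurred (a nominal variable) or the actual compensation amount (a numerical variable). A business classification factor is a risk factor for judging business risk customers according to business common sense; it is used to determine that customers whose values fall within a certain range carry higher business risk. For example, for automobile liability insurance the business classification factors are the driver's age, sex, claim history, and so on; for vehicle damage insurance they are the vehicle's age, model, date of manufacture, and so on.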
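After obtaining customer authorization, the server extracts or crawls the customer information of the customers to be classified (i.e., the customer information samples), which includes the customer information of customer groups whose group category has been manually labeled. It obtains the label information of the customer information samples, classifies the label information by the business classification factors to obtain classified label information, and determines the customers to be classified that correspond to the classified label information as the experience-classified customer group; it also classifies the customer information samples into business risk customer groups through the decision tree rule customer group segmentation model corresponding to the target variable, obtaining the decision-tree-classified customer group. The server may merge the experience-classified and decision-tree-classified customer groups to obtain a plurality of business risk customer groups to be processed. Alternatively, the server may compare the two to obtain the common customer group and the differing customer group, call a preset support vector machine (SVM) classification model to classify the customer information samples of the differing customer group into business risk customer groups, obtaining the target customer group, and merge the common customer group and the target customer group to obtain a plurality of business risk customer groups to be processed.
The server constructs the plurality of initial prediction models for each business risk customer group to be processed from the target variable as follows. When the target variable is a nominal variable, it constructs a plurality of classification prediction models for each business risk customer group to be processed and trains and optimizes them to obtain the initial prediction models; a classification prediction model classifies the degree of business risk of the business risk customer group to be processed. When the target variable is a numerical variable, it constructs a plurality of regression prediction models for each business risk customer group to be processed and trains and optimizes them to obtain the initial prediction models; a regression prediction model regresses the business risk value of the business risk customer group to be processed. Where the business requires model interpretability, the regression prediction model can be a highly interpretable model such as generalized linear regression or decision tree regression; in some marketing scenarios, or for business units with weaker interpretability requirements, complex machine learning models can also be considered.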
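Using the same target variable, each of the initial prediction models of each business risk customer group to be processed makes predictions, yielding the target prediction values of each initial prediction model for that group. The customers in the preset head share of the target prediction values sorted in descending order (for example, the top 5%) are determined as the customer group to be analyzed. The true classified customer group is obtained, the evaluation value of each initial prediction model is calculated from the customer group to be analyzed and the true classified customer group, the evaluation values are sorted in descending order, and the initial prediction model with the first-ranked evaluation value is determined as the target prediction model for that business risk customer group to be processed.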
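The model selection step can be sketched as scoring each candidate model on the head of its own ranking and keeping the best scorer. The application does not name the evaluation metric; the sketch assumes precision within the head share (the fraction of selected customers that belong to the true classified group), and all function names are hypothetical.

```python
import math
from typing import Dict, Set


def head_of_ranking(predictions: Dict[str, float], share: float) -> Set[str]:
    """Customer ids in the top `share` of predictions (at least one customer)."""
    ranked = sorted(predictions, key=predictions.get, reverse=True)
    return set(ranked[:max(1, math.ceil(len(ranked) * share))])


def evaluate(predictions: Dict[str, float], true_risky: Set[str], share: float) -> float:
    """Assumed metric: share of the selected head that is truly high risk."""
    selected = head_of_ranking(predictions, share)
    return len(selected & true_risky) / len(selected)


def choose_target_model(model_predictions: Dict[str, Dict[str, float]],
                        true_risky: Set[str], share: float = 0.05) -> str:
    """Return the name of the candidate model with the highest evaluation value."""
    scores = {name: evaluate(preds, true_risky, share)
              for name, preds in model_predictions.items()}
    return max(scores, key=scores.get)
```

Any other head-of-ranking metric, such as the mean actual compensation amount of the selected customers, could replace `evaluate` without changing the selection logic.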
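Experimental analysis shows that, compared with the top 5% of high-business-risk customers predicted after modeling the whole population, the top 5% of high-business-risk customers predicted by modeling each segmented customer group separately (i.e., the target prediction model) carry a higher level of risk: the mean actual compensation amount (i.e., the target variable) of the top 5% predicted by segmented customer group modeling (i.e., the target prediction model) is about 40% higher than that of the top 5% of high-risk customers predicted by the overall model, and their average number of claims (i.e., the target variable) is about 20% higher than with the overall model. Modeling each circled bad-customer group (i.e., the business risk customer group to be processed) separately after grouping (i.e., the target prediction model) amounts to an in-depth characterization of each group (i.e., the business risk customer group to be processed): the target prediction model is used to find the customers with higher business risk within the group, which ensures the accuracy with which the target prediction model captures bad customers (i.e., the business risk customer group to be processed).
The variables and cut-off thresholds used to divide business risk customer groups can be output explicitly by the decision tree rule customer group segmentation model. Unlike the predictions of complex models for an individual business risk customer, which are difficult for front-end salespeople to understand, the rules output by the decision tree rule customer group segmentation model are therefore relatively intuitive. The variables used to segment business risk customer groups are extensible: the more important variables can be selected for extension using front-end business knowledge, effectively combining technical capability with business experience. The criteria for dividing segmented business risk customer groups can be determined flexibly. And modeling each business risk customer group to be processed separately after grouping (i.e., the target prediction model) amounts to an in-depth characterization of each group, using the target prediction model to find the customers with higher business risk within the group, which ensures the accuracy with which the target prediction model captures the business risk customer group to be processed and thus improves the accuracy of business risk customer group identification over massive customer groups.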
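203. Obtain the customer information, business variables, and experience classification factors of the customer group to be processed, and classify the customer information of the customer group to be processed by the experience classification factors to obtain the experience-classified customer group.
The execution process of step 203 is similar to that of step 101 above and is not repeated here.
204. Call the decision tree rule customer group segmentation model corresponding to the business variable, and perform business risk customer group prediction on the customer information of the customer group to be processed to obtain the decision-tree-classified customer group.
Specifically, the server obtains the target dimension corresponding to the business variable, the target dimension being any one of the one-dimensional, two-dimensional, and three-dimensional variables corresponding to the business variable; calls the decision tree rule customer group segmentation model corresponding to the target dimension and, based on the preset dimension classification rules, performs mean lift calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result; and classifies the customer group to be processed according to the analysis result to obtain the decision-tree-classified customer group.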
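The server obtains the target dimension corresponding to the business variable, the target dimension being any one of the one-dimensional, two-dimensional, and three-dimensional variables of the business variable, and creates a Structured Query Language (SQL) statement for the target dimension. It queries the decision tree models in the preset database through the SQL statement to obtain the corresponding decision tree rule customer group segmentation model and its model call address. It calls the model through the model call address and, based on the preset dimension classification rules, calculates the target variable mean lift of the customer information of the customer group to be processed in each customer group leaf node of the model, calculates the comprehensive mean lift of the customer information of the customer group to be processed across all customer group leaf nodes, and determines whether the target variable mean lift is greater than the comprehensive mean lift to obtain the analysis result. If the analysis result is yes, the customer group to be processed in the corresponding customer group leaf node is determined as the customer group to be classified; its target variable mean lift is compared with the preset lift index standards, and a customer group to be classified whose target variable mean lift meets a lift index standard is classified as the risk group corresponding to that standard, thereby obtaining the decision-tree-classified customer group. The lift index standards include those for the potential, low, medium, and high business risk customer groups; for example, the lift index standard for medium-risk customers is 2 and that for high-risk customers is 3.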
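The comparison against the lift index standards can be sketched as a threshold lookup. The sketch makes two assumptions that the text does not state: a lift "meets" a standard when it is greater than or equal to the threshold, and when several standards are met the highest one wins. Only the medium (2) and high (3) thresholds come from the example in the text; standards for the potential and low levels would be supplied the same way.

```python
from typing import Dict, Optional


def risk_level(lift: float, standards: Dict[str, float]) -> Optional[str]:
    """Return the highest risk level whose lift index standard is met, if any."""
    met = [(threshold, level)
           for level, threshold in standards.items()
           if lift >= threshold]  # assumption: meeting means lift >= threshold
    return max(met)[1] if met else None


# Thresholds taken from the example in the text; other levels are omitted
# because the application gives no values for them.
EXAMPLE_STANDARDS = {"medium": 2.0, "high": 3.0}
```

Under these assumptions a leaf node with a target variable mean lift of 2.5 is classified as medium risk, and one with 3.2 as high risk.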
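205. Merge the experience-classified customer group and the decision-tree-classified customer group to obtain the initial business risk customer group.
The execution process of step 205 is similar to that of step 103 above and is not repeated here.
206. Obtain the business risk customer group information of the initial business risk customer group, call the target prediction model corresponding to the initial business risk customer group, and perform business risk prediction on the business risk classification customer group information to obtain the business risk prediction value.
Specifically, the server obtains the business risk customer group information and customer group type of the initial business risk customer group, and traverses the preset prediction model structure tree by customer group type to obtain the target prediction model corresponding to the initial business risk customer group and the interface call address of the target prediction model; it then calls the target prediction model through the interface call address and performs business-risk-based regression or classification processing on the business risk customer group information to obtain the business risk prediction value.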
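The server obtains the business risk customer group information of the initial business risk customer group, which includes the customer information of the initial business risk customer group and the classification label information of that customer information, and calls the preset label extraction algorithm to extract the classification type from the classification label information, obtaining the customer group type. It creates an index of the customer group type and traverses the preset prediction model structure tree through the index to obtain the corresponding target prediction model and its interface call address. It obtains the business risk prediction index, calls the target prediction model through the interface call address, and, based on the business risk prediction index, performs business-risk-based regression or classification processing on the business risk customer group information to obtain the business risk prediction value. When the target prediction model performs business-risk-based regression processing, the business risk prediction value is the numerical business risk data corresponding to the business variable; for example, with the actual compensation amount as the business variable, the business risk prediction value is the actual compensation amount at risk. When the target prediction model performs business-risk-based classification processing, the business risk prediction value is the probability of each business risk level.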
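207. Screen the initial business risk customer group by the business risk prediction value to obtain the target business risk customer group.
Specifically, the server sorts the customers in the initial business risk customer group in descending order of business risk prediction value to obtain a candidate business risk customer group sequence, and selects from the candidate business risk customer group sequence in order, based on a preset proportion, to obtain the target business risk customer group.
In more detail, the server sorts the initial business risk customer group in descending order of business risk prediction value to obtain the candidate business risk customer group sequence, and acquires the data type of the business risk prediction value. If the data type is numerical data, corresponding to the business variable, that indicates business risk, the server acquires a preset business risk range value, divides the candidate business risk customer group sequence according to the business risk range value and the business risk prediction value to obtain the candidate business risk customer group corresponding to the business risk range value, and reads from that candidate group in order, based on the preset proportion, to obtain the target business risk customer group. If the data type is the probability of each level of business risk, the server selects the customers in the candidate business risk customer group sequence whose business risk prediction value is greater than a preset threshold, and then selects from the selected customers in order, based on the preset proportion, to obtain the target business risk customer group.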
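The screening step can be sketched as a sort followed by an optional filter and a proportional cut. The code below is a minimal illustration under stated assumptions: the function name `select_target_group`, its parameters, and the choice to round the kept count up with `math.ceil` are not taken from the application.

```python
import math


def select_target_group(predictions, proportion, *, value_range=None, min_probability=None):
    """Pick the highest-risk customers from {customer id: risk prediction value}.

    value_range: (low, high) bounds for regression-style predictions (assumed inclusive).
    min_probability: threshold for classification-style predictions (strictly greater).
    proportion: share of the remaining candidates to keep, taken from the top.
    """
    # Descending sort gives the candidate sequence.
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    if value_range is not None:
        low, high = value_range
        ranked = [(cid, value) for cid, value in ranked if low <= value <= high]
    if min_probability is not None:
        ranked = [(cid, value) for cid, value in ranked if value > min_probability]
    keep = math.ceil(len(ranked) * proportion)  # assumption: round the count up
    return [cid for cid, _ in ranked[:keep]]
```

For example, with five customers and a probability threshold of 0.5, only the customers above the threshold remain candidates, and the preset proportion is then applied to that shorter sequence.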
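In this embodiment of the present application, creating a decision-tree-rule customer segmentation model capable of efficiently capturing business risk customer groups, together with target prediction models capable of finding the customers with higher business risk within a group, ensures the accuracy with which the target business risk customer group is captured. Classifying the customer group to be processed in diverse ways safeguards the coverage of the target business risk customer group; finding the customers with higher business risk inside each group through the target prediction model ensures the accuracy with which the target prediction model captures the target business risk customer group; and classifying the customer information of the customer group to be processed using both the empirical classification factor and the decision-tree-rule customer segmentation model improves the accuracy of the initial business risk customer group. The accuracy of identifying the high-risk customer group (i.e., the target business risk customer group) is thus maintained while customer group coverage is increased, which improves the accuracy of business risk customer group identification over massive customer populations.
The method for identifying business risk customer groups in the embodiments of the present application has been described above; the apparatus for identifying business risk customer groups is described below. Referring to FIG. 3, one embodiment of the apparatus for identifying business risk customer groups in the embodiments of the present application includes:
a classification module 301, configured to acquire the customer information, business variable and empirical classification factor of the customer group to be processed, and to classify the customer information of the customer group to be processed by the empirical classification factor to obtain an empirically classified customer group;
a first prediction module 302, configured to invoke the decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining a decision-tree-classified customer group;
a merging module 303, configured to merge the empirically classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
a second prediction module 304, configured to acquire the business risk customer group information of the initial business risk customer group, and to invoke the target prediction model corresponding to the initial business risk customer group to perform business risk prediction on the business risk customer group information, obtaining a business risk prediction value;
a screening module 305, configured to screen the initial business risk customer group by the business risk prediction value to obtain the target business risk customer group.
The functions of the modules of the above apparatus for identifying business risk customer groups correspond to the steps of the above method embodiment; their functions and implementation are not repeated here.
In this embodiment of the present application, classifying the customer group to be processed in diverse ways safeguards the coverage of the target business risk customer group; finding the customers with higher business risk inside each group through the target prediction model ensures the accuracy with which the target prediction model captures the target business risk customer group; and classifying the customer information of the customer group to be processed using both the empirical classification factor and the decision-tree-rule customer segmentation model improves the accuracy of the initial business risk customer group. The accuracy of identifying the high-risk customer group (i.e., the target business risk customer group) is thus maintained while customer group coverage is increased, which improves the accuracy of business risk customer group identification over massive customer populations.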
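Referring to FIG. 4, another embodiment of the apparatus for identifying business risk customer groups in the embodiments of the present application includes:
a first creation module 306, configured to acquire a sample variable and create a decision-tree-rule customer segmentation model corresponding to the sample variable, the decision-tree-rule customer segmentation model being a decision tree regression model or a decision tree classification model;
a second creation module 307, configured to obtain a plurality of business risk customer groups to be processed through the decision-tree-rule customer segmentation model, and to build a target prediction model corresponding to each business risk customer group to be processed;
a classification module 301, configured to acquire the customer information, business variable and empirical classification factor of the customer group to be processed, and to classify the customer information of the customer group to be processed by the empirical classification factor to obtain an empirically classified customer group;
a first prediction module 302, configured to invoke the decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining a decision-tree-classified customer group;
a merging module 303, configured to merge the empirically classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
a second prediction module 304, configured to acquire the business risk customer group information of the initial business risk customer group, and to invoke the target prediction model corresponding to the initial business risk customer group to perform business risk prediction on the business risk customer group information, obtaining a business risk prediction value;
a screening module 305, configured to screen the initial business risk customer group by the business risk prediction value to obtain the target business risk customer group.
Optionally, the first creation module 306 may be further specifically configured to:
acquire a customer information sample set of a historical customer population and the sample variable, and construct business variables related to the sample variable to obtain a plurality of related business variables; build a plurality of decision tree models from the plurality of related business variables and the sample variable, and calculate the training mean lift of the customer-group leaf nodes of each decision tree model, each customer-group leaf node including a decision path; screen the decision paths by the training mean lift to obtain dimension classification rules of the historical customer population; and build the decision-tree-rule customer segmentation model of the plurality of decision tree models from the dimension classification rules of the historical customer population.
Optionally, the second creation module 307 may be further specifically configured to:
acquire customer information samples of customers to be classified, a target variable and a business classification factor, and perform business risk customer group classification on the customers to be classified using the business classification factor, the decision-tree-rule customer segmentation model and the customer information samples, to obtain a plurality of business risk customer groups to be processed; build, from the target variable, a plurality of initial prediction models corresponding to each business risk customer group to be processed, each initial prediction model being a regression prediction model or a classification prediction model; evaluate the predictions of each of the plurality of initial prediction models corresponding to each business risk customer group to be processed to obtain evaluation values; and sort the plurality of initial prediction models corresponding to each business risk customer group to be processed in descending order of evaluation value, and determine the first-ranked initial prediction model as the target prediction model corresponding to that business risk customer group to be processed.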
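The per-group model selection can be sketched as ranking the candidate models of each group by their evaluation value and keeping the first. This sketch assumes that a higher evaluation value is better and stands in a plain callback for the evaluation; the function `pick_target_models`, the group names and the model names are hypothetical.

```python
def pick_target_models(candidates_by_group, evaluate):
    """Choose one model per group.

    candidates_by_group: {group name: [candidate model identifiers]}
    evaluate: callable (group, model) -> evaluation value; assumed higher is better.
    """
    chosen = {}
    for group, candidates in candidates_by_group.items():
        # Descending order of evaluation value; the first entry becomes the target model.
        ranked = sorted(candidates, key=lambda model: evaluate(group, model), reverse=True)
        chosen[group] = ranked[0]
    return chosen
```

Because the ranking is done separately for each group, different groups can end up with different model types, which matches the idea of profiling each group on its own.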
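Optionally, the first prediction module 302 may be further specifically configured to:
acquire the target dimension corresponding to the business variable, the target dimension being any one of a one-dimensional variable, a two-dimensional variable and a three-dimensional variable corresponding to the business variable; invoke the decision-tree-rule customer segmentation model corresponding to the target dimension and, based on preset dimension classification rules, perform mean lift calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result; and classify the customer group to be processed according to the analysis result to obtain the decision-tree-classified customer group.
Optionally, the second prediction module 304 may be further specifically configured to:
acquire the business risk customer group information and the customer group type of the initial business risk customer group, and traverse a preset prediction model structure tree by the customer group type to obtain the target prediction model corresponding to the initial business risk customer group and the interface call address of the target prediction model; and invoke the target prediction model through the interface call address to perform business-risk-based regression or classification processing on the business risk customer group information, obtaining the business risk prediction value.
Optionally, the screening module 305 may be further specifically configured to:
sort the customers in the initial business risk customer group in descending order of business risk prediction value to obtain a candidate business risk customer group sequence; and select from the candidate business risk customer group sequence in order, based on a preset proportion, to obtain the target business risk customer group.
The functions of the modules and units of the above apparatus for identifying business risk customer groups correspond to the steps of the above method embodiment; their functions and implementation are not repeated here.
In this embodiment of the present application, creating a decision-tree-rule customer segmentation model capable of efficiently capturing business risk customer groups, together with target prediction models capable of finding the customers with higher business risk within a group, ensures the accuracy with which the target business risk customer group is captured. Classifying the customer group to be processed in diverse ways safeguards the coverage of the target business risk customer group; finding the customers with higher business risk inside each group through the target prediction model ensures the accuracy with which the target prediction model captures the target business risk customer group; and classifying the customer information of the customer group to be processed using both the empirical classification factor and the decision-tree-rule customer segmentation model improves the accuracy of the initial business risk customer group. The accuracy of identifying the high-risk customer group (i.e., the target business risk customer group) is thus maintained while customer group coverage is increased, which improves the accuracy of business risk customer group identification over massive customer populations.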
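FIG. 3 and FIG. 4 above describe the apparatus for identifying business risk customer groups in the embodiments of the present application in detail from the perspective of modular functional entities; the device for identifying business risk customer groups in the embodiments of the present application is described in detail below from the perspective of hardware processing.
FIG. 5 is a schematic structural diagram of a device for identifying business risk customer groups provided by an embodiment of the present application. The device 500 may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 510, a memory 520, and one or more storage media 530 (for example, one or more mass storage devices) storing application programs 533 or data 532. The memory 520 and the storage medium 530 may provide transient or persistent storage. A program stored in the storage medium 530 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations for the device 500. Further, the processor 510 may be configured to communicate with the storage medium 530 and to execute, on the device 500, the series of instruction operations in the storage medium 530.
The device 500 for identifying business risk customer groups may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD and the like. Those skilled in the art will understand that the device structure shown in FIG. 5 does not limit the device for identifying business risk customer groups, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The present application further provides a device for identifying business risk customer groups, including a memory and at least one processor, the memory storing instructions, and the memory and the at least one processor being interconnected by a line; the at least one processor invokes the instructions in the memory so that the device performs the steps of the above method for identifying business risk customer groups. The present application further provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium; the computer-readable storage medium stores instructions which, when run on a computer, cause the computer to perform the steps of the method for identifying business risk customer groups.
Further, the computer-readable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function, and the like, and the data storage area may store data created according to the use of blockchain nodes, and the like.
The blockchain referred to in the present application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another using cryptographic methods, each data block containing information on a batch of network transactions, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.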

Claims (20)

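  1. A method for identifying business risk customer groups, wherein the method comprises:
    acquiring customer information, a business variable and an empirical classification factor of a customer group to be processed, and classifying the customer information of the customer group to be processed by the empirical classification factor to obtain an empirically classified customer group;
    invoking a decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining a decision-tree-classified customer group;
    merging the empirically classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
    acquiring business risk customer group information of the initial business risk customer group, and invoking a target prediction model corresponding to the initial business risk customer group to perform business risk prediction on the business risk customer group information, obtaining a business risk prediction value;
    screening the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group.
  2. The method for identifying business risk customer groups according to claim 1, wherein, before the acquiring of the customer information, business variable and empirical classification factor of the customer group to be processed and the classifying of the customer information of the customer group to be processed by the empirical classification factor to obtain the empirically classified customer group, the method further comprises:
    acquiring a sample variable and creating a decision-tree-rule customer segmentation model corresponding to the sample variable, the decision-tree-rule customer segmentation model being a decision tree regression model or a decision tree classification model;
    obtaining a plurality of business risk customer groups to be processed through the decision-tree-rule customer segmentation model, and building a target prediction model corresponding to each business risk customer group to be processed.
  3. The method for identifying business risk customer groups according to claim 2, wherein the acquiring of the sample variable and the creating of the decision-tree-rule customer segmentation model corresponding to the sample variable comprises:
    acquiring a customer information sample set of a historical customer population and the sample variable, and constructing business variables related to the sample variable to obtain a plurality of related business variables;
    building a plurality of decision tree models from the plurality of related business variables and the sample variable, and calculating the training mean lift of the customer-group leaf nodes of each decision tree model, each customer-group leaf node including a decision path;
    screening the decision paths by the training mean lift to obtain dimension classification rules of the historical customer population;
    building the decision-tree-rule customer segmentation model of the plurality of decision tree models from the dimension classification rules of the historical customer population.
  4. The method for identifying business risk customer groups according to claim 2, wherein the obtaining of the plurality of business risk customer groups to be processed through the decision-tree-rule customer segmentation model and the building of the target prediction model corresponding to each business risk customer group to be processed comprises:
    acquiring customer information samples of customers to be classified, a target variable and a business classification factor, and performing business risk customer group classification on the customers to be classified using the business classification factor, the decision-tree-rule customer segmentation model and the customer information samples, to obtain the plurality of business risk customer groups to be processed;
    building, from the target variable, a plurality of initial prediction models corresponding to each business risk customer group to be processed, each initial prediction model being a regression prediction model or a classification prediction model;
    evaluating the predictions of each of the plurality of initial prediction models corresponding to each business risk customer group to be processed to obtain evaluation values;
    sorting the plurality of initial prediction models corresponding to each business risk customer group to be processed in descending order of evaluation value, and determining the first-ranked initial prediction model as the target prediction model corresponding to that business risk customer group to be processed.
  5. The method for identifying business risk customer groups according to claim 1, wherein the invoking of the decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining the decision-tree-classified customer group, comprises:
    acquiring a target dimension corresponding to the business variable, the target dimension being any one of a one-dimensional variable, a two-dimensional variable and a three-dimensional variable corresponding to the business variable;
    invoking the decision-tree-rule customer segmentation model corresponding to the target dimension and, based on preset dimension classification rules, performing mean lift calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result;
    classifying the customer group to be processed according to the analysis result to obtain the decision-tree-classified customer group.
  6. The method for identifying business risk customer groups according to claim 1, wherein the acquiring of the business risk customer group information of the initial business risk customer group and the invoking of the target prediction model corresponding to the initial business risk customer group to perform business risk prediction on the business risk customer group information, obtaining the business risk prediction value, comprises:
    acquiring the business risk customer group information and a customer group type of the initial business risk customer group, and traversing a preset prediction model structure tree by the customer group type to obtain the target prediction model corresponding to the initial business risk customer group and an interface call address of the target prediction model;
    invoking the target prediction model through the interface call address to perform business-risk-based regression processing or classification processing on the business risk customer group information, obtaining the business risk prediction value.
  7. The method for identifying business risk customer groups according to any one of claims 1 to 6, wherein the screening of the initial business risk customer group by the business risk prediction value to obtain the target business risk customer group comprises:
    sorting the customers in the initial business risk customer group in descending order of business risk prediction value to obtain a candidate business risk customer group sequence;
    selecting from the candidate business risk customer group sequence in order, based on a preset proportion, to obtain the target business risk customer group.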
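  8. A device for identifying business risk customer groups, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
    acquiring customer information, a business variable and an empirical classification factor of a customer group to be processed, and classifying the customer information of the customer group to be processed by the empirical classification factor to obtain an empirically classified customer group;
    invoking a decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining a decision-tree-classified customer group;
    merging the empirically classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
    acquiring business risk customer group information of the initial business risk customer group, and invoking a target prediction model corresponding to the initial business risk customer group to perform business risk prediction on the business risk customer group information, obtaining a business risk prediction value;
    screening the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group.
  9. The device for identifying business risk customer groups according to claim 8, wherein, before the acquiring of the customer information, business variable and empirical classification factor of the customer group to be processed and the classifying of the customer information of the customer group to be processed by the empirical classification factor to obtain the empirically classified customer group, the steps further comprise:
    acquiring a sample variable and creating a decision-tree-rule customer segmentation model corresponding to the sample variable, the decision-tree-rule customer segmentation model being a decision tree regression model or a decision tree classification model;
    obtaining a plurality of business risk customer groups to be processed through the decision-tree-rule customer segmentation model, and building a target prediction model corresponding to each business risk customer group to be processed.
  10. The device for identifying business risk customer groups according to claim 9, wherein the acquiring of the sample variable and the creating of the decision-tree-rule customer segmentation model corresponding to the sample variable comprises:
    acquiring a customer information sample set of a historical customer population and the sample variable, and constructing business variables related to the sample variable to obtain a plurality of related business variables;
    building a plurality of decision tree models from the plurality of related business variables and the sample variable, and calculating the training mean lift of the customer-group leaf nodes of each decision tree model, each customer-group leaf node including a decision path;
    screening the decision paths by the training mean lift to obtain dimension classification rules of the historical customer population;
    building the decision-tree-rule customer segmentation model of the plurality of decision tree models from the dimension classification rules of the historical customer population.
  11. The device for identifying business risk customer groups according to claim 9, wherein the obtaining of the plurality of business risk customer groups to be processed through the decision-tree-rule customer segmentation model and the building of the target prediction model corresponding to each business risk customer group to be processed comprises:
    acquiring customer information samples of customers to be classified, a target variable and a business classification factor, and performing business risk customer group classification on the customers to be classified using the business classification factor, the decision-tree-rule customer segmentation model and the customer information samples, to obtain the plurality of business risk customer groups to be processed;
    building, from the target variable, a plurality of initial prediction models corresponding to each business risk customer group to be processed, each initial prediction model being a regression prediction model or a classification prediction model;
    evaluating the predictions of each of the plurality of initial prediction models corresponding to each business risk customer group to be processed to obtain evaluation values;
    sorting the plurality of initial prediction models corresponding to each business risk customer group to be processed in descending order of evaluation value, and determining the first-ranked initial prediction model as the target prediction model corresponding to that business risk customer group to be processed.
  12. The device for identifying business risk customer groups according to claim 8, wherein the invoking of the decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining the decision-tree-classified customer group, comprises:
    acquiring a target dimension corresponding to the business variable, the target dimension being any one of a one-dimensional variable, a two-dimensional variable and a three-dimensional variable corresponding to the business variable;
    invoking the decision-tree-rule customer segmentation model corresponding to the target dimension and, based on preset dimension classification rules, performing mean lift calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result;
    classifying the customer group to be processed according to the analysis result to obtain the decision-tree-classified customer group.
  13. The device for identifying business risk customer groups according to claim 8, wherein the acquiring of the business risk customer group information of the initial business risk customer group and the invoking of the target prediction model corresponding to the initial business risk customer group to perform business risk prediction on the business risk customer group information, obtaining the business risk prediction value, comprises:
    acquiring the business risk customer group information and a customer group type of the initial business risk customer group, and traversing a preset prediction model structure tree by the customer group type to obtain the target prediction model corresponding to the initial business risk customer group and an interface call address of the target prediction model;
    invoking the target prediction model through the interface call address to perform business-risk-based regression processing or classification processing on the business risk customer group information, obtaining the business risk prediction value.
  14. The device for identifying business risk customer groups according to any one of claims 8 to 13, wherein the screening of the initial business risk customer group by the business risk prediction value to obtain the target business risk customer group comprises:
    sorting the customers in the initial business risk customer group in descending order of business risk prediction value to obtain a candidate business risk customer group sequence;
    selecting from the candidate business risk customer group sequence in order, based on a preset proportion, to obtain the target business risk customer group.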
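  15. A computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps:
    acquiring customer information, a business variable and an empirical classification factor of a customer group to be processed, and classifying the customer information of the customer group to be processed by the empirical classification factor to obtain an empirically classified customer group;
    invoking a decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining a decision-tree-classified customer group;
    merging the empirically classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
    acquiring business risk customer group information of the initial business risk customer group, and invoking a target prediction model corresponding to the initial business risk customer group to perform business risk prediction on the business risk customer group information, obtaining a business risk prediction value;
    screening the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group.
  16. The computer-readable storage medium according to claim 15, wherein, before the acquiring of the customer information, business variable and empirical classification factor of the customer group to be processed and the classifying of the customer information of the customer group to be processed by the empirical classification factor to obtain the empirically classified customer group, the steps further comprise:
    acquiring a sample variable and creating a decision-tree-rule customer segmentation model corresponding to the sample variable, the decision-tree-rule customer segmentation model being a decision tree regression model or a decision tree classification model;
    obtaining a plurality of business risk customer groups to be processed through the decision-tree-rule customer segmentation model, and building a target prediction model corresponding to each business risk customer group to be processed.
  17. The computer-readable storage medium according to claim 16, wherein the acquiring of the sample variable and the creating of the decision-tree-rule customer segmentation model corresponding to the sample variable comprises:
    acquiring a customer information sample set of a historical customer population and the sample variable, and constructing business variables related to the sample variable to obtain a plurality of related business variables;
    building a plurality of decision tree models from the plurality of related business variables and the sample variable, and calculating the training mean lift of the customer-group leaf nodes of each decision tree model, each customer-group leaf node including a decision path;
    screening the decision paths by the training mean lift to obtain dimension classification rules of the historical customer population;
    building the decision-tree-rule customer segmentation model of the plurality of decision tree models from the dimension classification rules of the historical customer population.
  18. The computer-readable storage medium according to claim 16, wherein the obtaining of the plurality of business risk customer groups to be processed through the decision-tree-rule customer segmentation model and the building of the target prediction model corresponding to each business risk customer group to be processed comprises:
    acquiring customer information samples of customers to be classified, a target variable and a business classification factor, and performing business risk customer group classification on the customers to be classified using the business classification factor, the decision-tree-rule customer segmentation model and the customer information samples, to obtain the plurality of business risk customer groups to be processed;
    building, from the target variable, a plurality of initial prediction models corresponding to each business risk customer group to be processed, each initial prediction model being a regression prediction model or a classification prediction model;
    evaluating the predictions of each of the plurality of initial prediction models corresponding to each business risk customer group to be processed to obtain evaluation values;
    sorting the plurality of initial prediction models corresponding to each business risk customer group to be processed in descending order of evaluation value, and determining the first-ranked initial prediction model as the target prediction model corresponding to that business risk customer group to be processed.
  19. The computer-readable storage medium according to claim 15, wherein the invoking of the decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining the decision-tree-classified customer group, comprises:
    acquiring a target dimension corresponding to the business variable, the target dimension being any one of a one-dimensional variable, a two-dimensional variable and a three-dimensional variable corresponding to the business variable;
    invoking the decision-tree-rule customer segmentation model corresponding to the target dimension and, based on preset dimension classification rules, performing mean lift calculation and threshold comparison analysis on the customer information of the customer group to be processed to obtain an analysis result;
    classifying the customer group to be processed according to the analysis result to obtain the decision-tree-classified customer group.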
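  20. An apparatus for identifying business risk customer groups, wherein the apparatus comprises:
    a classification module, configured to acquire customer information, a business variable and an empirical classification factor of a customer group to be processed, and to classify the customer information of the customer group to be processed by the empirical classification factor to obtain an empirically classified customer group;
    a first prediction module, configured to invoke a decision-tree-rule customer segmentation model corresponding to the business variable to perform business risk customer group prediction on the customer information of the customer group to be processed, obtaining a decision-tree-classified customer group;
    a merging module, configured to merge the empirically classified customer group and the decision-tree-classified customer group to obtain an initial business risk customer group;
    a second prediction module, configured to acquire business risk customer group information of the initial business risk customer group, and to invoke a target prediction model corresponding to the initial business risk customer group to perform business risk prediction on the business risk customer group information, obtaining a business risk prediction value;
    a screening module, configured to screen the initial business risk customer group by the business risk prediction value to obtain a target business risk customer group.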
PCT/CN2022/071685 2021-07-06 2022-01-13 Method, apparatus, device and storage medium for identifying business risk customer groups WO2023279696A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110762845.4 2021-07-06
CN202110762845.4A CN113254510B (zh) 2021-07-06 2021-07-06 Method, apparatus, device and storage medium for identifying business risk customer groups

Publications (1)

Publication Number Publication Date
WO2023279696A1 true WO2023279696A1 (zh) 2023-01-12

Family

ID=77190865

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071685 WO2023279696A1 (zh) 2021-07-06 2022-01-13 Method, apparatus, device and storage medium for identifying business risk customer groups

Country Status (2)

Country Link
CN (1) CN113254510B (zh)
WO (1) WO2023279696A1 (zh)

Also Published As

Publication number Publication date
CN113254510B (zh) 2021-09-28
CN113254510A (zh) 2021-08-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22836456

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22836456

Country of ref document: EP

Kind code of ref document: A1