CN114004491A - Retail risk exposure pool dividing method and device, computer equipment and medium - Google Patents

Publication number
CN114004491A
Authority
CN
China
Prior art keywords
risk exposure
sample
retail
model
pool
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111276004.9A
Other languages
Chinese (zh)
Inventor
覃春钰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by China Construction Bank Corp
Priority to CN202111276004.9A
Publication of CN114004491A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5011 Pool

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • General Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Technology Law (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Educational Administration (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

Embodiments of the invention provide a retail risk exposure pooling method and device, computer equipment and a medium, and relate to the field of automatic program design. The method comprises: obtaining a sample for constructing a model from a feature wide table corresponding to a retail risk exposure; performing high-level subdivision on the sample in response to user configuration; and subdividing the samples in each sample set obtained by the high-level subdivision layer by layer according to pooling risk factors, and generating a retail risk exposure pooling model based on the high-level subdivision result and the layer-by-layer subdivision result. The embodiments standardize and automate the construction of the retail risk exposure pooling model and greatly shorten its development time; for model iteration, iteration can be performed in response to user configuration so that business requirements are met quickly and model iteration efficiency is improved; and the risk of the modeling process is controlled to a certain extent, reducing errors caused by manual operation.

Description

Retail risk exposure pool dividing method and device, computer equipment and medium
Technical Field
The embodiment of the invention relates to the field of automatic program design, in particular to information processing and bank risk management technologies, and more particularly to a retail risk exposure pooling method and device, computer equipment and a medium.
Background
The retail risk exposure pooling system is an important component of the internal ratings-based approach to credit risk and is mainly applied to risk management policy making, the credit approval process, economic resource allocation, risk management and the like. Risk pooling is a measurement technique that comprehensively considers characteristics of a retail risk exposure such as account age, application score and behavior score, combines debt items with consistent risk characteristics based on statistical test indexes, and then describes the risk parameters of the corresponding exposures through the statistical indexes of the different pools.
At present, the analysis and construction of a retail PD (Probability of Default) pooling model are mainly completed manually by modeling personnel. The analysis involves many methods and steps, and a large amount of analysis work is repeated. As a result, modeling efficiency is low, the model construction and iteration cycle is long, manual analysis errors occur, model quality is difficult to control uniformly, and it is difficult to respond quickly to regulatory and business risk management requirements.
Disclosure of Invention
The embodiment of the invention provides a retail risk exposure pooling method and device, computer equipment and a medium, which can automatically construct a retail risk exposure pooling model.
In a first aspect, an embodiment of the present invention provides a retail risk exposure pooling method, including:
obtaining a sample for constructing a model from a feature wide table corresponding to retail risk exposure;
performing high-level subdivision on the sample in response to user configuration;
and subdividing samples in the sample set obtained by high-level subdivision layer by layer according to the sub-pool risk factors, and generating a retail risk exposure sub-pool model based on the high-level subdivision result and the layer-by-layer subdivision result.
In a second aspect, an embodiment of the present invention further provides a retail risk exposure pooling device, including:
a sample acquisition module for obtaining a sample for constructing a model from a feature wide table corresponding to retail risk exposure;
a high-level subdivision module for performing high-level subdivision on the samples in response to user configuration;
and the model generation module is used for subdividing the samples in the sample set obtained by high-level subdivision layer by layer according to the sub-pool risk factors and generating a retail risk exposure sub-pool model based on the high-level subdivision result and the layer-by-layer subdivision result.
In a third aspect, an embodiment of the present invention further provides a computer device, including:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement a retail risk exposure pooling method as described in any embodiment of the invention.
In a fourth aspect, embodiments of the present invention also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform a method of pooling retail risk exposure according to any of the embodiments of the present invention.
The embodiment of the invention provides a retail risk exposure pooling method which comprises the steps of obtaining a sample for constructing a model, responding to user configuration to carry out high-level subdivision on the sample, carrying out layer-by-layer subdivision on the sample in a sample set obtained by the high-level subdivision according to a pooling risk factor, and generating a retail risk exposure pooling model based on a high-level subdivision result and a layer-by-layer subdivision result. The embodiment of the invention realizes the standardization and automation of the construction of the retail risk exposure sub-pool model, and greatly shortens the development time of the retail risk exposure sub-pool model; in the aspect of model iteration, iteration can be performed in response to user configuration so as to quickly respond to service requirements, and the model iteration efficiency is improved; and the risk of the modeling process is controlled to a certain extent, and the error caused by manual operation is reduced.
Drawings
FIG. 1 is a flow chart of a method for pooling retail risk exposure provided by an embodiment of the present invention;
FIG. 2 is a flow chart of another retail risk exposure pooling method provided by embodiments of the present invention;
FIG. 3 is a flow chart of yet another retail risk exposure pooling method provided by an embodiment of the present invention;
FIG. 4 is a flow chart of yet another retail risk exposure pooling method provided by an embodiment of the present invention;
FIG. 5 is a schematic processing flow diagram of a modeling data determination module according to an embodiment of the present invention;
FIG. 6 is a schematic processing flow diagram of a modeling training data set analysis module according to an embodiment of the present invention;
FIG. 7 is a schematic processing flow diagram of a model tuning module according to an embodiment of the present invention;
FIG. 8 is a schematic processing flow diagram of a parameter checking module according to an embodiment of the present invention;
FIG. 9 is a schematic processing flow diagram of a model verification module according to an embodiment of the present invention;
FIG. 10 is a block diagram of a retail risk exposure pooling device according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
For ease of understanding, terms that may appear hereinafter are first explained.
Pooling refers to placing debt items with homogeneous risk characteristics into the same portfolio or asset pool and consistently estimating the Basel risk parameters of that asset pool, including Probability of Default (PD), Loss Given Default (LGD) and Exposure at Default (EAD).
High-level subdivision refers to the process of dividing the entire population into different sub-populations based on business experience and best practice experience. Specifically, the high-level subdivision may be performed mainly considering three factors:
1) Regulatory requirements. Retail risk exposure is separated into three broad categories, namely residential mortgage loans, qualifying revolving retail risk exposures, and other retail risk exposures.
2) Default status. Defaulted and non-defaulted retail risk exposures are partitioned, and the default probability and loss given default of the defaulted and non-defaulted retail risk exposures are estimated separately.
3) The actual situation of the business, such as the product category, account age and overdue status of the retail risk exposure.
Chi-square Automatic Interaction Detection (CHAID) is an efficient statistical binning technique. Using statistical tests, CHAID evaluates all values of each potential predictor variable, merges the values that are statistically homogeneous with respect to the target variable into one bin, and retains heterogeneous values as separate bins. It then selects the best predictor to form the first branch of the decision tree, so that each child node is homogeneous, and the above process is executed iteratively until the whole decision tree is completed.
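By way of illustration only, the merging step of CHAID described above can be sketched as follows. This is a simplified sketch and not the implementation of the embodiment: the function names `chi_square` and `merge_most_similar`, the (bad, good) count layout and the example counts are hypothetical, and a fixed critical value of 3.841 (chi-square, one degree of freedom, 5% significance level) stands in for the p-value comparison with multiplicity adjustment that a full CHAID implementation would use.

```python
from itertools import combinations


def chi_square(a, b):
    """Pearson chi-square statistic for two categories given as (bad, good) counts."""
    total = sum(a) + sum(b)
    column_totals = (a[0] + b[0], a[1] + b[1])
    stat = 0.0
    for row in (a, b):
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * column_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def merge_most_similar(counts, critical=3.841):
    """One merge step: combine the two most similar categories if they are
    not significantly different with respect to the default flag."""
    pair = min(
        combinations(sorted(counts), 2),
        key=lambda p: chi_square(counts[p[0]], counts[p[1]]),
    )
    first, second = pair
    if chi_square(counts[first], counts[second]) >= critical:
        return dict(counts)  # every pair is heterogeneous, keep separate bins
    merged = {k: v for k, v in counts.items() if k not in pair}
    merged[first + "+" + second] = (
        counts[first][0] + counts[second][0],
        counts[first][1] + counts[second][1],
    )
    return merged


# Hypothetical (bad, good) counts for three values of one predictor.
counts = {"A": (10, 90), "B": (12, 88), "C": (40, 60)}
merged = merge_most_similar(counts)
```

In this example, values A and B have nearly identical bad rates and are merged into one bin, while C remains a separate bin.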
Fig. 1 is a flowchart of a retail risk exposure pooling method according to an embodiment of the present invention, which may be implemented by a retail risk exposure pooling device, which may be implemented by software and/or hardware and is typically configured in a computer device. As shown in fig. 1, the method includes:
Step 110, obtaining a sample for constructing the model from the feature wide table corresponding to the retail risk exposure.
Customer data corresponding to retail risk exposure is stored in the feature wide table, and different categories of retail risk exposure are stored in different feature wide tables. Customer data includes, but is not limited to, historical account flow data, account status data, payment data, overdue records, and rating records of the customer corresponding to the retail risk exposure.
A sample consists of the customer data under the corresponding retail risk exposure in the feature wide table together with the default identifier of the corresponding customer. For example, the default definition corresponding to the retail risk exposure is applied to the customer data to determine whether the customer defaults; if so, the default identifier is set to 1, otherwise it is set to 0, and the customer data is labeled with the default identifier.
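By way of illustration only, the labeling of customer data with a default identifier may be sketched as follows. The default definition used here (maximum days past due of at least 90), the field name `max_days_past_due` and the function name `label_default` are assumptions made for the example; the embodiment leaves the default definition to be determined by business personnel.

```python
def label_default(customers, dpd_threshold=90):
    """Return copies of the customer records with a default identifier added.

    Assumed default definition: a customer defaults when the maximum
    days past due reaches dpd_threshold.
    """
    return [
        dict(c, default_flag=1 if c["max_days_past_due"] >= dpd_threshold else 0)
        for c in customers
    ]


# Hypothetical customer records read from a feature wide table.
customers = [
    {"customer_id": "c1", "max_days_past_due": 0},
    {"customer_id": "c2", "max_days_past_due": 120},
]
samples = label_default(customers)
```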
Illustratively, a feature wide table corresponding to the retail risk exposure is read from the pooling feature library, and customer data used for building a retail risk exposure pooling model is read from the feature wide table as a sample.
Optionally, source data required for constructing a retail risk exposure pooling model is collected from a banking source system based on the pooling frequency, and the source data is stored into a pre-configured feature wide table according to the category of the retail risk exposure. The pooling frequency is a parameter configured by business personnel. For example, the pooling frequency may be set so that the system automatically calculates a retail risk exposure pooling model at the end of each month.
Specifically, when the pooling frequency is met, the various source data required for building the pooling model are collected from the banking source system by technical means such as Extract-Transform-Load (ETL). Corresponding feature wide tables are established according to the historical modeling experience of business personnel, with different categories of retail risk exposure corresponding to different feature wide tables. The source data is stored into the corresponding feature wide table according to the category of the retail risk exposure, and a risk feature library dedicated to constructing the pooling model is built on the basis of the feature wide tables. When the pooling frequency is met again, new source data required for building the retail risk exposure pooling model is collected from the banking source system, the risk feature library is automatically updated and iterated based on the new source data, and the updated library is applied to the automatic iteration of the subsequent pooling model.
Step 120, performing high-level subdivision on the sample in response to user configuration.
Specifically, a first high-level subdivision is performed on the sample in response to user configuration to obtain at least two sample sets after the first high-level subdivision; and a further high-level subdivision is performed, in response to user configuration, on the samples in each sample set obtained by the first high-level subdivision, so as to obtain at least two sample sets after the further high-level subdivision. The user can configure the rules of high-level subdivision according to actual application requirements, and the samples are classified based on the rules configured by the user. For example, high-level subdivision may be performed according to the category of retail risk exposure. Subdivision may also be performed at the product level of the retail risk exposure, or according to account age, overdue status and the like.
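By way of illustration only, a two-stage high-level subdivision driven by user-configured rules may be sketched as follows. The rule functions, field names (`category`, `months_on_book`) and the 12-month account-age cut-off are hypothetical; the embodiment does not prescribe a particular rule format.

```python
from collections import defaultdict


def subdivide(samples, rule):
    """Group samples by the segment key that a configured rule assigns to each one."""
    groups = defaultdict(list)
    for sample in samples:
        groups[rule(sample)].append(sample)
    return dict(groups)


def high_level_subdivision(samples, first_rule, second_rule):
    """Apply the first rule, then apply the second rule inside each resulting set."""
    result = {}
    for key1, group in subdivide(samples, first_rule).items():
        for key2, subset in subdivide(group, second_rule).items():
            result[(key1, key2)] = subset
    return result


# Hypothetical user configuration: exposure category first, then account age.
by_category = lambda s: s["category"]
by_account_age = lambda s: "new" if s["months_on_book"] < 12 else "seasoned"

samples = [
    {"id": 1, "category": "mortgage", "months_on_book": 3},
    {"id": 2, "category": "mortgage", "months_on_book": 40},
    {"id": 3, "category": "revolving", "months_on_book": 8},
]
segments = high_level_subdivision(samples, by_category, by_account_age)
```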
Step 130, subdividing the samples in the sample set obtained by high-level subdivision layer by layer according to the pooling risk factors, and generating a retail risk exposure pooling model based on the high-level subdivision result and the layer-by-layer subdivision result.
The pooling risk factors can be feature indexes in the sample. For example, the pooling risk factors may include scoring indexes, debtor indexes, debt indexes, behavior information indexes, and the like. The scoring indexes include application score, behavior score, collection score and the like. The debtor indexes include occupation, region, marital status, number of dependents, and the like. The debt indexes include guarantee type, mortgage rate, remaining loan proportion, credit rate, account age and the like. The behavior information indexes include the overdue status over a past period of time, the payment status and the like.
Illustratively, the user's configuration information for the pooling risk factors is obtained, and based on this configuration information the samples in the sample set obtained by high-level subdivision are subdivided layer by layer through a decision tree algorithm.
For the pooling result, the user can also configure further pooling based on the available credit line, the account type, the account status, and the like. For example, for the above pooling result, the user configures further pooling based on whether the available credit line exceeds a set value. As another example, for the result of that further pooling, the user configures further pooling by student cards, employee cards, VIP accounts, and other accounts. As yet another example, for the result of that further pooling, the user configures further pooling by active accounts and dormant accounts.
Illustratively, after the high-level subdivision is completed, the sample set after the high-level subdivision is split, and a training set and a test set are generated. The training set is used to generate a retail risk exposure pooling model. The test set is used to validate the retail risk exposure pooling model.
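By way of illustration only, splitting a sample set into a training set and a test set may be sketched as follows. The 70/30 ratio, the fixed random seed and the function name `split_train_test` are assumptions for the example; the embodiment does not specify a split ratio.

```python
import random


def split_train_test(samples, test_ratio=0.3, seed=42):
    """Shuffle a copy of the samples reproducibly and cut it into two disjoint sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample identifiers from one high-level segment.
segment = list(range(10))
train_set, test_set = split_train_test(segment)
```

A fixed seed keeps the split reproducible across model iterations, which makes successive pooling models easier to compare.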
The Chi-square Automatic Interaction Detection (CHAID) algorithm is then used to further pool the samples in the training set based on the various scores, borrower information, debt item information, overdue information, behavior summary information, collection information and the like. A decision tree is generated based on the association between the samples in the high-level subdivision result and the layer-by-layer subdivision result, and the decision tree is displayed as the retail risk exposure pooling model. It should be noted that, in the displaying process, the statistical information of each tree node is displayed together with the tree result. Optionally, after the decision tree is displayed, the method further includes obtaining a pruning operation performed by the user on the decision tree, and pruning the decision tree in response to the pruning operation to generate a pruned retail risk exposure pooling model.
Optionally, if the user chooses to train the decision tree manually, after the high-level subdivision is finished, the feature indexes selected by the user are obtained, the samples in the training set are automatically subjected to bottom-level subdivision based on the selected feature indexes to generate the decision tree, and the decision tree is displayed as the retail risk exposure pooling model. It should be noted that, in the displaying process, the statistical information of each tree node is displayed together with the tree result. Optionally, after the decision tree is displayed, the method further includes obtaining a pruning operation performed by the user on the decision tree, and pruning the decision tree in response to the pruning operation to generate a pruned retail risk exposure pooling model.
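By way of illustration only, a decision tree that carries node statistics and supports a user pruning operation may be sketched as follows. The `PoolNode` structure, the node names and the counts are hypothetical and are not taken from the embodiment; pruning is modeled simply as collapsing a named node into a leaf.

```python
from dataclasses import dataclass, field


@dataclass
class PoolNode:
    """A tree node with the statistics shown alongside it when displayed."""
    name: str
    bad: int
    total: int
    children: list = field(default_factory=list)

    @property
    def bad_rate(self):
        return self.bad / self.total if self.total else 0.0


def prune(node, target):
    """Collapse the node called target into a leaf; return True if it was found."""
    if node.name == target:
        node.children = []
        return True
    return any(prune(child, target) for child in node.children)


def leaves(node):
    """The leaves of the tree are the pools of the pooling model."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


# Hypothetical tree: one segment split by behavior score, then by account age.
tree = PoolNode("segment", 60, 1000, [
    PoolNode("score_low", 50, 400, [
        PoolNode("score_low_new", 30, 150),
        PoolNode("score_low_seasoned", 20, 250),
    ]),
    PoolNode("score_high", 10, 600),
])
pools_before = leaves(tree)
found = prune(tree, "score_low")
pools_after = leaves(tree)
```

Because each node stores its own aggregate counts, a pruned node keeps valid statistics after its children are removed.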
According to the technical solution of this embodiment, the sample for constructing the model is obtained, high-level subdivision is performed on the sample in response to user configuration, the samples in the sample set obtained by the high-level subdivision are subdivided layer by layer according to the pooling risk factors, and the retail risk exposure pooling model is generated based on the high-level subdivision result and the layer-by-layer subdivision result. The embodiment of the invention standardizes and automates the construction of the retail risk exposure pooling model and greatly shortens its development time; for model iteration, iteration can be performed in response to user configuration so that business requirements are met quickly and model iteration efficiency is improved; and the risk of the modeling process is controlled to a certain extent, reducing errors caused by manual operation.
On the basis of the above technical solution, before the sample for constructing the model is obtained from the feature wide table corresponding to the retail risk exposure, the embodiment of the invention further includes the additional technical feature of invoking a time-series analysis calculation interface to calculate the bad debt rate within a set period.
Fig. 2 is a flowchart of another retail risk exposure pooling method according to an embodiment of the present invention, as shown in fig. 2, the method includes:
Step 210, acquiring a data object from the feature wide table corresponding to the retail risk exposure.
The data object comprises source data and a default identifier.
Illustratively, a feature wide table corresponding to the retail risk exposure is read from the risk feature library, a feature analysis interface is invoked, statistical analysis is performed on the field contents corresponding to the feature fields of the feature wide table according to the methods included in the feature analysis interface, and the statistical analysis result is displayed. For example, the feature analysis interface includes a statistical analysis method for the loss rate, and the like. Business personnel determine the performance period and the default definition for the feature wide table based on regulatory requirements and the statistical analysis result. The default identifiers of the field contents in the feature wide table are determined through ETL according to the default definition and are stored in association with the feature wide table. The data object with the default identifier is then read from the feature wide table corresponding to the retail risk exposure.
Step 220, invoking a time-series analysis calculation interface to calculate the bad debt rate of the data object within a set period, and obtaining the relationship between the bad debt rate and time.
Step 230, displaying the relationship between the bad debt rate and time to prompt business personnel to select a sample for constructing the model from the data objects based on that relationship.
Illustratively, the change of the bad debt rate over time is displayed, so that business personnel can judge from the change which time points' data objects can be used as modeling samples and which as out-of-time samples. The out-of-time samples are samples that do not overlap with the modeling samples in time, and they are used for model verification.
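By way of illustration only, the relationship between the bad debt rate and time may be computed as follows. The record layout (an observation period paired with a default identifier) and the function name `bad_rate_by_period` are assumptions for the example; the time-series analysis calculation interface of the embodiment is not specified in this form.

```python
from collections import defaultdict


def bad_rate_by_period(records):
    """Return {period: bad debt rate}, ordered by period.

    records: iterable of (period, default_flag) pairs, default_flag being 0 or 1.
    """
    totals = defaultdict(lambda: [0, 0])  # period -> [bad count, record count]
    for period, default_flag in records:
        totals[period][0] += default_flag
        totals[period][1] += 1
    return {period: bad / n for period, (bad, n) in sorted(totals.items())}


# Hypothetical data objects: observation month and default identifier.
records = [
    ("2020-02", 0), ("2020-01", 1), ("2020-01", 0),
    ("2020-02", 0), ("2020-01", 0), ("2020-01", 0),
]
rates = bad_rate_by_period(records)
```

Periods with a stable bad debt rate are natural candidates for modeling samples, while a later non-overlapping period can be held back as the out-of-time sample.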
Optionally, if the sample size of the modeling samples exceeds a set threshold, the samples need to be sampled. A sampling rule configured by business personnel is acquired, and a sample processing interface is invoked to sample the data objects based on the sampling rule. The sampling rule may be a sampling ratio. For example, in response to a sampling ratio input by business personnel, the sample processing interface is invoked to sample the modeling samples according to the sampling ratio.
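By way of illustration only, sampling by a configured ratio once the sample size exceeds a threshold may be sketched as follows. The threshold value, the use of simple random sampling without replacement and the function name `sample_if_large` are assumptions for the example; the embodiment only states that the sampling rule may be a sampling ratio.

```python
import random


def sample_if_large(samples, ratio, threshold, seed=0):
    """Return all samples, or a random subset when the set exceeds the threshold."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    samples = list(samples)
    if len(samples) <= threshold:
        return samples
    k = max(1, round(len(samples) * ratio))
    return random.Random(seed).sample(samples, k)


# Hypothetical modeling samples and business configuration.
modeling_samples = list(range(1000))
sampled = sample_if_large(modeling_samples, ratio=0.1, threshold=500)
untouched = sample_if_large(modeling_samples, ratio=0.1, threshold=5000)
```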
Step 240, obtaining a sample for constructing the model from the feature wide table corresponding to the retail risk exposure.
Step 250, performing high-level subdivision on the sample in response to user configuration.
Step 260, subdividing the samples in the sample set obtained by high-level subdivision layer by layer according to the pooling risk factors, and generating a retail risk exposure pooling model based on the high-level subdivision result and the layer-by-layer subdivision result.
The technical scheme of the embodiment also provides an intuitive sample screening mode, and by performing time sequence analysis on the data objects in the characteristic wide table and displaying the time sequence analysis result, business personnel can quickly select a sample for constructing the model from the data objects based on the time sequence analysis result, so that the sample construction efficiency and accuracy are improved.
On the basis of the above technical solution, the embodiment of the present invention further includes the following additional technical features after performing a first high-level subdivision on a sample in response to user configuration to obtain at least two sample sets after the first high-level subdivision: and calculating the evaluation index of the characteristic field of the sample after the first high-level subdivision, and screening the characteristic field contained in the sample in each sample set based on the evaluation index.
Fig. 3 is a flowchart of another retail risk exposure pooling method according to an embodiment of the present invention, as shown in fig. 3, the method includes:
Step 310, acquiring a data object from the feature wide table corresponding to the retail risk exposure.
Step 320, invoking a time-series analysis calculation interface to calculate the bad debt rate of the data object within a set period, and obtaining the relationship between the bad debt rate and time.
Step 330, displaying the relationship between the bad debt rate and time to prompt business personnel to select a sample for constructing the model from the data objects based on that relationship.
Step 340, obtaining a sample for constructing the model from the feature wide table corresponding to the retail risk exposure.
Step 350, performing a first high-level subdivision on the sample in response to user configuration to obtain at least two sample sets after the first high-level subdivision.
Step 360, calculating evaluation indexes of the feature fields of the samples after the first high-level subdivision, and screening the feature fields contained in the samples in each sample set based on the evaluation indexes.
The evaluation index of a feature field is used for evaluating the predictive performance of the feature field. For example, the evaluation index may be the Information Value (IV), the Kolmogorov-Smirnov (KS) statistic, the Area Under the ROC Curve (AUC), the Gini coefficient, and the like.
Illustratively, equal-frequency and equal-width binning interfaces are invoked to bin the feature fields of the samples in each sample set. Optionally, after the data is binned and before the evaluation indexes of the feature fields in each bin are calculated, the field contents whose feature field is an application score or a behavior score are obtained from the samples in the sample set after high-level subdivision, and these field contents are mapped according to a preset rule so that their values fall within a preset value interval.
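By way of illustration only, equal-width and equal-frequency binning of one feature field may be sketched as follows. The function names and the example values are hypothetical and do not represent the binning interfaces of the embodiment.

```python
import bisect


def equal_width_edges(values, n_bins):
    """Interior cut points that split the value range into equally wide bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def equal_frequency_edges(values, n_bins):
    """Interior cut points that give each bin roughly the same number of samples."""
    ordered = sorted(values)
    return [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]


def assign_bin(value, edges):
    """Index of the bin a value falls into; a value equal to an edge goes right."""
    return bisect.bisect_right(edges, value)


# Hypothetical field contents with one outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
width_edges = equal_width_edges(values, 2)
freq_edges = equal_frequency_edges(values, 2)
width_bins = [assign_bin(v, width_edges) for v in values]
freq_bins = [assign_bin(v, freq_edges) for v in values]
```

With the outlier present, equal-width binning places nine of the ten values in the first bin, whereas equal-frequency binning places five values in each bin.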
The evaluation indexes of the feature fields in each bin are then calculated, and the feature fields contained in the samples in each sample set are screened based on the evaluation indexes. For example, when the evaluation index of a feature field meets a preset requirement, that feature field is retained in the sample for subsequent generation of the decision tree. When the evaluation index does not meet the preset requirement, the corresponding feature field is deleted from the sample.
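By way of illustration only, screening feature fields by Information Value (IV) may be sketched as follows. The IV is computed as the sum over bins of (good share minus bad share) multiplied by the natural logarithm of (good share divided by bad share). The additive smoothing of 0.5, the screening threshold of 0.02 (a commonly quoted rule of thumb) and the field names are assumptions for the example; the embodiment only requires that the index meet a preset requirement.

```python
import math


def information_value(bins, smoothing=0.5):
    """IV of one feature field from per-bin (bad, good) counts.

    Additive smoothing avoids a division by zero or log of zero for empty cells.
    """
    total_bad = sum(bad for bad, _ in bins) + smoothing * len(bins)
    total_good = sum(good for _, good in bins) + smoothing * len(bins)
    iv = 0.0
    for bad, good in bins:
        bad_share = (bad + smoothing) / total_bad
        good_share = (good + smoothing) / total_good
        iv += (good_share - bad_share) * math.log(good_share / bad_share)
    return iv


def screen_fields(binned_fields, threshold=0.02):
    """Keep the feature fields whose IV meets the preset requirement."""
    return [
        name for name, bins in binned_fields.items()
        if information_value(bins) >= threshold
    ]


# Hypothetical per-bin (bad, good) counts for two feature fields.
binned_fields = {
    "behavior_score": [(30, 70), (5, 95)],
    "marital_status": [(10, 90), (10, 90)],
}
kept = screen_fields(binned_fields)
```

Here `behavior_score` separates bad from good accounts and is retained, while `marital_status` has identical bad rates in both bins, giving an IV of zero, and is dropped.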
Step 370, performing, in response to user configuration, a further high-level subdivision on the samples in each screened sample set to obtain at least two sample sets after the further high-level subdivision.
Step 380, subdividing the samples in the sample sets obtained by the further high-level subdivision layer by layer according to the pooling risk factors, and generating a retail risk exposure pooling model based on the high-level subdivision result and the layer-by-layer subdivision result.
The technical solution of this embodiment also provides a way of screening the feature fields of the samples. By additionally calculating the evaluation indexes of the feature fields of the samples after the first high-level subdivision and screening the feature fields contained in the samples in each sample set based on those indexes, only the feature fields that meet the requirements are retained for subsequently generating the retail risk exposure pooling model. This reduces the data volume handled in the model generation step and improves the efficiency of model generation and iteration.
On the basis of the above technical scheme, after the sample used for constructing the model is obtained from the feature width table corresponding to the retail risk exposure, the embodiment of the invention further calls a maturity analysis interface and performs maturity effect analysis on the sample by a cross-section method to determine the maturity effect point.
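One way to picture the cross-section analysis is sketched below, under stated assumptions. The text does not give the maturity effect point definition, so the stabilization rule used here (the first account age after which the default rate changes by no more than a tolerance for several consecutive ages) is a hypothetical stand-in, as are the function names and default parameter values.

```python
from collections import defaultdict

def default_rate_by_age(accounts):
    """Cross-section view: default rate per account age.

    accounts: iterable of (account_age_in_months, defaulted_flag) pairs
    observed at a single point in time.
    """
    totals = defaultdict(int)
    bads = defaultdict(int)
    for age, defaulted in accounts:
        totals[age] += 1
        bads[age] += 1 if defaulted else 0
    return {age: bads[age] / totals[age] for age in sorted(totals)}

def find_maturity_point(rates, tolerance=0.10, window=3):
    """Hypothetical rule: first age from which the default rate moves by at
    most `tolerance` (relative change) for `window` consecutive ages.

    Returns None when the curve never stabilizes within the observed ages.
    """
    ages = sorted(rates)
    for start in range(len(ages) - window):
        stable = True
        for k in range(start, start + window):
            prev, nxt = rates[ages[k]], rates[ages[k + 1]]
            if prev == 0 or abs(nxt - prev) / prev > tolerance:
                stable = False
                break
        if stable:
            return ages[start]
    return None
```

In practice the analysis result is displayed and the point is confirmed against the maturity effect point definition, so a rule like this would only propose a candidate.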
Fig. 4 is a flowchart of another retail risk exposure pooling method according to an embodiment of the present invention, as shown in fig. 4, the method includes:
Step 410, obtaining a sample for constructing the model in the characteristic width table corresponding to the retail risk exposure.
Step 420, calling a maturity analysis interface, analyzing the maturity effect of the sample by a cross-section method, and displaying the analysis result.
Step 430, determining a maturity effect point according to the analysis result and the maturity effect point definition, wherein the maturity effect point is used for calculating a maturity adjustment parameter.
Step 440, performing high-level subdivision on the sample in response to the user configuration.
Step 450, subdividing the samples in the sample set obtained by high-level subdivision layer by layer according to the sub-pool risk factors, and generating a retail risk exposure sub-pool model based on the high-level subdivision result and the layer-by-layer subdivision result.
Step 460, calculating the long-term average value of each retail risk exposure sub-pool.
Illustratively, the long-term average default probability of each retail risk exposure sub-pool is estimated by weighted averaging.
Step 470, calling a maturity adjustment parameter calculation interface, and calculating the maturity adjustment parameter of each retail risk exposure sub-pool based on the maturity effect points.
Step 480, calibrating the risk parameters of each retail risk exposure sub-pool according to the long-term average value and the maturity adjustment parameters to obtain a risk parameter calibration result.
It should be noted that the retail risk exposure sub-pool model groups "homogeneous" retail risk exposures within the same scoring interval into one pool. The process of estimating the default probability of each pool is referred to as calibration. Suppose that a pool contains m accounts and that Xn (n = 1, ..., m) is a random variable indicating whether the nth account in the pool defaults (1 for default, 0 for no default); then:
P(Xn=1)=P
P(Xn=0)=1-P
According to Bernoulli's law of large numbers, when the number of accounts in a pool is sufficiently large, the actual default rate of the pool converges in probability to P; therefore, the actual default rate of each pool can be used as a representative value of the default probability of that pool.
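The convergence argument can be checked numerically with a short simulation. This is only an illustration under the stated Bernoulli assumption; the function name, the seed and the pool sizes are hypothetical choices.

```python
import random

def observed_default_rate(m, p, seed=0):
    """Simulate m independent accounts, each defaulting with probability p,
    and return the observed default rate of the pool."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    defaults = sum(1 for _ in range(m) if rng.random() < p)
    return defaults / m

if __name__ == "__main__":
    true_pd = 0.02
    for pool_size in (100, 10_000, 200_000):
        rate = observed_default_rate(pool_size, true_pd)
        print(pool_size, rate)
```

As the pool size grows, the printed rate tends to settle near the true default probability, which is why a pool with few accounts gives a less reliable estimate.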
The preliminary estimated pool default probability is the default probability in the sample, and is the result obtained by using the statistical analysis technology on the selected sample, and the result may have deviation from the real default probability. In order to estimate the more accurate default probability of each sample, the default probability in the sample is adjusted by using the long-term average value and the maturity adjustment parameter.
Specifically, the long-term average value of each retail risk exposure sub-pool is calculated as a weighted average of default probabilities, and a confidence interval for the long-term average value is given at a chosen confidence level. The construction sample of the model development stage is divided into a training sample and a verification sample according to a certain proportion. For a retail risk exposure sub-pool whose verification-sample default probability falls within the confidence interval of the training sample, that default probability is taken as the suggested PD. For a retail risk exposure sub-pool whose verification-sample default probability is greater than the upper limit of the 95% confidence interval estimated from the training sample, the upper limit of the 95% confidence interval of the long-term average default rate is used as the suggested PD to ensure a conservative estimate. For nodes for which no verification sample was reserved because there are too few defaulted accounts, the upper limit of the 95% confidence interval of the long-term average default rate is used directly as the suggested PD. This follows the basic requirement of the New Basel Capital Accord for default probability estimation: for retail risk exposure, the default probability (also referred to as the PD estimate) is the greater of 0.03% and the one-year default probability corresponding to the internal rating of the pool.
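The selection rule for the suggested PD can be sketched as follows. Several points are assumptions rather than statements of the source: the weights are taken to be account counts per observation period, the confidence interval uses a normal approximation for a proportion, the training-sample interval and the long-term-average interval are treated as the same interval, the long-term average is used when the verification rate lies inside the interval, and a verification rate below the interval (a case the text does not cover) is handled in the same way. All names are hypothetical.

```python
import math

PD_FLOOR = 0.0003  # the 0.03% floor quoted in the text

def long_term_average(period_rates, period_weights):
    """Weighted average of per-period default rates (weights assumed to be
    the number of accounts observed in each period)."""
    total = sum(period_weights)
    return sum(r * w for r, w in zip(period_rates, period_weights)) / total

def confidence_interval(rate, n_accounts, z=1.96):
    """Assumed normal-approximation interval; z=1.96 corresponds to 95%."""
    half_width = z * math.sqrt(rate * (1.0 - rate) / n_accounts)
    return max(0.0, rate - half_width), min(1.0, rate + half_width)

def suggested_pd(lta, n_accounts, validation_rate=None):
    """Pick the suggested PD for one sub-pool.

    validation_rate=None models a node with no reserved verification sample.
    """
    _, upper = confidence_interval(lta, n_accounts)
    if validation_rate is None or validation_rate > upper:
        candidate = upper   # conservative choice: upper limit of the interval
    else:
        candidate = lta     # verification rate is consistent with training
    return max(PD_FLOOR, candidate)
```

For example, a pool with a long-term average of 2% over 1,000 accounts keeps 2% when the verification rate is 2.1%, but moves to the upper interval limit when the verification rate is 5%.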
A maturity adjustment parameter calculation interface is called, and the maturity adjustment parameter of each retail risk exposure sub-pool is calculated based on the maturity effect points, wherein:
Figure BDA0003329988570000131 (formula image)
wherein:
Figure BDA0003329988570000132 (formula image)
wherein default number i denotes the number of defaulted accounts at the ith account age before the maturity effect point, and total account number i denotes the total number of accounts at the ith account age before the maturity effect point.
The maturity adjustment parameter of each retail risk exposure sub-pool is multiplied by its default probability (namely the PD estimate), and the product is taken as the default probability calibration result of the corresponding retail risk exposure sub-pool.
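The final calibration step is a per-pool multiplication, sketched below. Because the formula for the maturity adjustment parameter is not reproduced in the text, the parameter is treated here as an input that has already been computed; the function name and the pool labels are hypothetical.

```python
def calibrate_pd(pd_estimates, maturity_adjustments):
    """Multiply each sub-pool's PD estimate by its maturity adjustment parameter.

    pd_estimates:         {pool_id: PD estimate}
    maturity_adjustments: {pool_id: maturity adjustment parameter (assumed given)}
    """
    missing = set(pd_estimates) - set(maturity_adjustments)
    if missing:
        raise KeyError(f"no maturity adjustment for pools: {sorted(missing)}")
    return {
        pool: maturity_adjustments[pool] * pd
        for pool, pd in pd_estimates.items()
    }

if __name__ == "__main__":
    calibrated = calibrate_pd(
        {"pool_1": 0.020, "pool_2": 0.005},
        {"pool_1": 1.20, "pool_2": 1.00},
    )
    for pool, value in calibrated.items():
        print(pool, round(value, 6))
```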
Step 490, verifying heterogeneity, judiciousness, accuracy, concentration, discriminative power and stability of the retail risk exposure pooling model.
Illustratively, according to the default probability calibration result, quantitative verification indexes are calculated for the retail risk exposure sub-pool model and for the model input variables respectively. The quantitative verification indexes are compared with standard indexes, and the heterogeneity, judiciousness, accuracy, concentration, discriminative power and stability of the retail risk exposure sub-pool model are verified according to the comparison results.
Optionally, the default probabilities are respectively calculated through retail risk exposure sub-pool models generated by different business personnel by adopting different modeling schemes, and the product capital occupation is calculated based on the default probabilities respectively.
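The text does not say how capital occupation is computed from the default probabilities. One common choice, used here purely as an assumption, is the Basel IRB capital requirement function for retail exposures, K = LGD * N[(G(PD) + sqrt(R) * G(0.999)) / sqrt(1 - R)] - PD * LGD, where N is the standard normal distribution function, G its inverse and R the asset correlation. The function names and the example inputs are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # standard normal distribution
PD_FLOOR = 0.0003            # the 0.03% floor quoted in the text

def capital_requirement(pd, lgd, correlation):
    """Assumed Basel IRB retail capital requirement per unit of exposure.

    pd must lie strictly between 0 and 1 after flooring.
    """
    pd = max(pd, PD_FLOOR)
    g_pd = _STD_NORMAL.inv_cdf(pd)
    g_conf = _STD_NORMAL.inv_cdf(0.999)
    z = (g_pd + correlation ** 0.5 * g_conf) / (1.0 - correlation) ** 0.5
    return lgd * _STD_NORMAL.cdf(z) - pd * lgd

def capital_by_scheme(schemes, lgd, correlation):
    """Compare modeling schemes.

    schemes: {scheme_name: [(pool_pd, pool_exposure), ...]}
    Returns the total capital per scheme.
    """
    return {
        name: sum(ead * capital_requirement(pd, lgd, correlation) for pd, ead in pools)
        for name, pools in schemes.items()
    }
```

Running `capital_by_scheme` on the pool-level PDs produced by two modeling schemes for the same product gives a side-by-side view of the capital each scheme would imply.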
According to the technical scheme of the embodiment, the risk parameters in the sample are adjusted by utilizing the long-term average value and the maturity adjusting parameters, so that the calculation result of the retail risk exposure sub-pool model is closer to the risk parameters under the real condition.
In a specific implementation, the embodiment of the invention standardizes and systematizes the modeling process of the retail risk exposure sub-pool model. The modeling process is reviewed, the analysis steps that are repeated during modeling are standardized, and operation steps that together complete a given function are combined and abstracted into independent function modules. Automatic computation replaces the manual writing of SQL or SAS code by modeling personnel, which greatly improves the iteration efficiency of the model.
In addition, the data sets, intermediate results and opinion documents generated in the modeling process are integrated into a unified platform for management and organized as a project workflow, and the processing flow is packaged into tool modules to facilitate subsequent model verification. As a result, the modeling flow and the model verification have a degree of continuity and consistency, the key points of model management are more prominent, and the verification process is clearer and more transparent.
Specifically, the embodiment of the invention comprises: a characteristic variable library module, a modeling data set determination module, a model training data set analysis module, a model tuning module, a parameter calibration module and a model verification module.
Based on the sub-pool modeling experience of business personnel and on data analysis results, the characteristic variable library module builds, through technical means such as data ETL, a risk characteristic library dedicated to constructing default probability sub-pool models, automatically updates and iterates the risk characteristic library, and applies it to the automatic iteration of subsequent sub-pool models.
Specifically, the characteristic variable library module collects various source data required by constructing a pooling model from a banking business source system through technical means such as data ETL and the like; establishing a corresponding characteristic width table in advance according to historical modeling experience of business personnel; and correspondingly storing the source data into a characteristic width table according to the category of retail risk exposure, and providing a modeling sample for a subsequent functional module as a data interface.
It should be noted that the source data includes, but is not limited to, the historical account running data, account status data, payment data, overdue records, score records and the like of the customers under the products corresponding to each exposure. The data ETL technology includes several ways of obtaining data, such as synchronizing data directly from a source system, loading data through file transfer, and obtaining data through a data warehouse.
The modeling data set determination module determines a series of key definitions of the sub-pool model by analyzing the data, which lays a foundation for subsequent modeling. The key definitions include the default definition, the performance period, the selection of modeling time points, the selection of out-of-time verification time points, and the like.
Specifically, the modeling data set determination module reads data from a feature width table of the characteristic variable library module, performs basic analysis and comparative analysis on the data, determines the key definitions of the model, and determines a modeling sample set and an out-of-time sample set on the basis of the time sequence analysis result.
Fig. 5 is a schematic processing flow diagram of a modeling data set determination module according to an embodiment of the present invention. As shown in fig. 5, the process flow includes:
Step 510, reading a corresponding feature width table from the sub-pool feature library as the object of subsequent analysis work.
Step 520, calling a characteristic analysis interface, and performing statistical analysis on the characteristics of the feature width table.
For example, statistical analysis is performed on the characteristic fields of the feature width table, the missing rate of the characteristic fields is analyzed, and the statistical analysis result is displayed as a reference for subsequent work.
Step 530, based on the supervision requirements and the statistical analysis result, the business personnel determine the performance period and the default definition of the model, and the features in the feature width table are marked through ETL according to the default definition to serve as the data object for subsequent analysis.
Step 540, calling a time sequence analysis calculation interface, performing sample time sequence analysis on the data object, and displaying the sample distribution at different time points.
Step 550, calling a sample processing interface to generate a modeling sample set and an out-of-time sample set according to the sample distribution.
Alternatively, for the case of a large number of samples, the necessary sampling work may be selected in consideration of the complexity of the calculation. For example, the samples may be sampled according to a sampling ratio set by a service person.
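The time sequence analysis, the split into modeling and out-of-time sample sets, and the optional sampling can be sketched as follows. The record layout (a `period` string such as "2020-03" and a 0/1 `default_flag`), the cutoff-based split and the function names are assumptions made for illustration.

```python
import random
from collections import defaultdict

def bad_rate_by_period(records):
    """Bad-account rate per observation period, ordered by period."""
    totals = defaultdict(int)
    bads = defaultdict(int)
    for r in records:
        totals[r["period"]] += 1
        bads[r["period"]] += r["default_flag"]
    return {p: bads[p] / totals[p] for p in sorted(totals)}

def split_by_time(records, oot_start):
    """Records before oot_start form the modeling set; the rest form the
    out-of-time set. Periods in 'YYYY-MM' form compare correctly as strings."""
    modeling = [r for r in records if r["period"] < oot_start]
    out_of_time = [r for r in records if r["period"] >= oot_start]
    return modeling, out_of_time

def sample_records(records, ratio, seed=0):
    """Simple random sampling at the ratio set by business personnel."""
    rng = random.Random(seed)
    size = round(len(records) * ratio)
    return rng.sample(records, size)
```

In the described flow the bad-account rates are displayed first, and the business personnel then choose the cutoff and the sampling ratio from what they see.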
The model training data set analysis module performs high-level subdivision and maturity effect analysis on the modeling sample, and at the same time performs basic analysis and comparison on the characteristic fields. The modeling sample is determined by the modeling data set determination module.
Specifically, the model training data set analysis module performs high-level subdivision and maturity effect analysis based on the modeling sample on the basis of the modeling data set determination module, determines the candidate characteristic field of the modeling sample, and provides parameters for the subsequent functional modules.
Fig. 6 is a schematic processing flow diagram of a model training data set analysis module according to an embodiment of the present invention. As shown in fig. 6, the process flow includes:
Step 610, reading the modeling sample set output by the modeling data set determination module.
Step 620, calling a maturity analysis interface, performing maturity effect analysis on the modeling sample set by a cross-section method, displaying the maturity effect analysis result, and determining a maturity effect point.
Step 630, analyzing the characteristic fields of the feature width table, calculating the performance evaluation indexes of the characteristic fields by calling the equal-frequency and equal-width binning interfaces, and manually screening the characteristic fields for subsequent decision tree generation.
The model tuning module implements automatic training of a decision tree, manual training of a decision tree, manual pruning and the like based on the chi-square automatic interaction detection (CHAID) algorithm, so as to generate a sub-pool decision tree.
Specifically, the CHAID algorithm is implemented in Python, and on this basis automatic decision tree training, manual pruning and the like are carried out to obtain the sub-pool decision tree model.
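The core of a CHAID split is a chi-square test of independence between a candidate field and the default flag. The sketch below shows only that step and is deliberately simplified: it ranks fields by the raw chi-square statistic, whereas full CHAID also merges similar categories and compares Bonferroni-adjusted p-values, which matters when fields have different numbers of categories. Function and field names are hypothetical.

```python
from collections import Counter

def chi_square(categories, defaults):
    """Pearson chi-square statistic of the category-by-default contingency table."""
    n = len(categories)
    category_counts = Counter(categories)
    target_counts = Counter(defaults)
    observed = Counter(zip(categories, defaults))
    stat = 0.0
    for category, n_cat in category_counts.items():
        for target, n_target in target_counts.items():
            expected = n_cat * n_target / n
            stat += (observed[(category, target)] - expected) ** 2 / expected
    return stat

def best_split_field(samples, defaults, fields):
    """Simplified selection: the field with the largest chi-square statistic.

    Returns (field_name, {field_name: statistic}).
    """
    scores = {
        name: chi_square([s[name] for s in samples], defaults)
        for name in fields
    }
    return max(scores, key=scores.get), scores
```

Applied recursively to each resulting segment, with a stopping rule on segment size or significance, this selection step yields the layer-by-layer subdivision that the decision tree represents.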
Fig. 7 is a schematic processing flow diagram of a model tuning module according to an embodiment of the present invention. As shown in fig. 7, the process flow includes:
Step 710, splitting the modeling sample set according to a set proportion to generate a training set and a test set for generating a decision tree.
Step 720, determining whether the user chooses to generate the decision tree automatically; if yes, going to step 730, otherwise going to step 750.
Step 730, performing high-level subdivision in response to user configuration, and after the high-level subdivision is finished, calling a CHAID algorithm interface to automatically generate a decision tree based on the training set.
Step 740, displaying the automatically generated decision tree.
Specifically, the tree structure of the decision tree and the statistical information of the tree nodes are shown.
Step 750, performing high-level subdivision in response to user configuration, and after the high-level subdivision is finished, performing bottom-level subdivision in a semi-manual mode, in which the subdivision is carried out automatically on the manually selected characteristic fields of the samples in the training set, to generate a manually trained decision tree.
Step 760, displaying the manually trained decision tree.
Specifically, the tree structure of the decision tree and the statistical information of the tree nodes are shown.
Step 770, pruning the decision tree in response to a manual pruning request, and generating the final decision tree result.
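The displayed tree structure, the node statistics and the manual pruning step can be pictured with a small data structure. This is a sketch under assumptions: the node fields (`label`, `accounts`, `defaults`) and the helper names are hypothetical, and pruning is modeled simply as collapsing a chosen node into a leaf so that it becomes a single pool.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PoolNode:
    label: str
    accounts: int
    defaults: int
    children: List["PoolNode"] = field(default_factory=list)

    @property
    def default_rate(self) -> float:
        return self.defaults / self.accounts if self.accounts else 0.0

    def leaves(self) -> List["PoolNode"]:
        """Leaf nodes are the final sub-pools."""
        if not self.children:
            return [self]
        result = []
        for child in self.children:
            result.extend(child.leaves())
        return result

def find(node: PoolNode, label: str) -> Optional[PoolNode]:
    if node.label == label:
        return node
    for child in node.children:
        hit = find(child, label)
        if hit is not None:
            return hit
    return None

def prune(root: PoolNode, label: str) -> bool:
    """Collapse the subtree under `label` into one pool. Returns False if absent."""
    node = find(root, label)
    if node is None:
        return False
    node.children = []
    return True

def describe(node: PoolNode, depth: int = 0) -> List[str]:
    """Text rendering of the tree structure with per-node statistics."""
    line = f"{'  ' * depth}{node.label}: n={node.accounts}, default_rate={node.default_rate:.2%}"
    lines = [line]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines
```

After `prune(root, "A")`, the children of node A no longer appear in `root.leaves()`, so A itself is treated as one sub-pool in the final decision tree result.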
The parameter calibration module calculates the long-term average value of the sub-pools based on the results of the preceding modules, and adjusts the sub-pool result according to the calculated maturity adjustment coefficient to obtain the reference parameters for sub-pool capital.
Specifically, based on the high-level segmentation result, the long-term average value of each sub-pool is calculated according to the sub-pool model generated by the model tuning module, and the sub-pool result is adjusted according to the maturity adjustment coefficient calculated by the model training data set analysis module, so that the capital parameters of the sub-pools are obtained. Wherein the pooling result is a default probability.
Fig. 8 is a schematic processing flow diagram of a parameter calibration module according to an embodiment of the present invention. As shown in fig. 8, the process flow includes:
Step 810, setting parameters according to the result of the maturity effect analysis, calling a maturity adjustment parameter calculation interface, and calculating the maturity adjustment parameter.
Step 820, performing any necessary score mapping or score conversion, depending on whether application scores or behavior scores are used.
Step 830, calculating the long-term average value of each sub-pool over a selected time period.
Step 840, combining the decision tree result and the long-term average value, calibrating the sub-pool parameters, and outputting a sub-pool parameter calibration result.
The model verification module calculates and compares the quantitative verification indexes of the model and of the model input variables, and calculates the product capital occupation according to the parameters obtained by different modeling schemes.
Specifically, the model verification module calculates the quantitative verification indexes of the sub-pool model and of the model input variables respectively according to the capital parameters obtained by the parameter calibration module, and compares them. It also measures the product capital occupation according to the sub-pool results obtained by different modeling schemes.
Fig. 9 is a schematic processing flow diagram of a model verification module according to an embodiment of the present invention. As shown in fig. 9, the process flow includes:
Step 910, performing a heterogeneity test on the different pools of the modeling sample set and of the out-of-time sample set respectively.
Step 920, performing a judiciousness test on the different pools of the verification sample set and of the out-of-time sample set respectively.
Step 930, performing an accuracy test on the different pools of the verification sample set and of the out-of-time sample set respectively.
Step 940, performing a model concentration test on the different pools of the modeling sample set and of the out-of-time sample set respectively.
Step 950, calculating the discriminative power indexes of the model, and testing the discriminative power of the model based on these indexes.
The discriminative power indexes include KS, GINI and the like.
Step 960, evaluating the stability index of the model, and testing the stability of the model based on the stability index.
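Two of the quantitative verification indexes can be sketched as follows. KS is named in the text; the stability index is not, so the population stability index (PSI) is used here as an assumption. The input layout (counts per pool, ordered from lowest to highest risk) and the function names are hypothetical.

```python
import math

def ks_statistic(pools):
    """KS: largest gap between the cumulative share of good accounts and the
    cumulative share of bad accounts across pools.

    pools: list of (good_count, bad_count), ordered from lowest to highest risk.
    """
    total_good = sum(g for g, _ in pools)
    total_bad = sum(b for _, b in pools)
    cum_good = cum_bad = 0
    ks = 0.0
    for good, bad in pools:
        cum_good += good
        cum_bad += bad
        ks = max(ks, abs(cum_good / total_good - cum_bad / total_bad))
    return ks

def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    """Assumed stability index: PSI = sum((a% - e%) * ln(a% / e%)) over pools.

    expected_counts: accounts per pool in the modeling sample.
    actual_counts:   accounts per pool in the out-of-time sample.
    """
    total_expected = sum(expected_counts)
    total_actual = sum(actual_counts)
    psi = 0.0
    for e, a in zip(expected_counts, actual_counts):
        pe = max(e / total_expected, eps)  # eps guards against empty pools
        pa = max(a / total_actual, eps)
        psi += (pa - pe) * math.log(pa / pe)
    return psi
```

Each index would then be compared with its standard threshold, as described for the verification of discriminative power and stability.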
According to the technical scheme of the embodiment of the invention, the retail PD sub-pool modeling process is standardized and automated, the time required for model development and iteration is greatly shortened, business requirements can be responded to quickly, the risk of the modeling process is controlled to a certain extent, and manual operation errors are reduced. In addition, modeling process data, process files, documents and the like are managed centrally, which facilitates subsequent verification work and ensures the consistency and continuity of development and verification.
Fig. 10 is a block diagram of a retail risk exposure pooling device according to an embodiment of the present invention, which can perform the retail risk exposure pooling method according to any embodiment of the present invention, and construct a retail risk exposure pooling model by performing the above method. The apparatus may be implemented by software and/or hardware and configured in a computer device. As shown in fig. 10, the apparatus includes:
a sample obtaining module 1010, configured to obtain a sample for constructing a model in a feature width table corresponding to retail risk exposure;
a high-level subdivision module 1020 for high-level subdivision of samples in response to user configuration;
and the model generating module 1030 is configured to subdivide the samples in the sample set obtained by high-level subdivision according to the sub-pool risk factors layer by layer, and generate a retail risk exposure sub-pool model based on the high-level subdivision result and the layer-by-layer subdivision result.
The embodiment of the invention provides a retail risk exposure pool dividing device, which realizes standardization and automation of retail risk exposure pool dividing model construction by executing a retail risk exposure pool dividing method, and greatly shortens the retail risk exposure pool dividing model development time; in the aspect of model iteration, iteration can be performed in response to user configuration so as to quickly respond to service requirements, and the model iteration efficiency is improved; and the risk of the modeling process is controlled to a certain extent, and the error caused by manual operation is reduced.
Optionally, the apparatus further comprises:
the data storage module is used for collecting source data required by the establishment of the retail risk exposure pooling model from the banking business source system based on the pooling frequency before obtaining a sample used for establishing the model in the feature width table corresponding to the retail risk exposure; and correspondingly storing the source data into a pre-configured characteristic width table according to the category of the retail risk exposure.
Optionally, the sample includes historical account running data, account status data, payment data, overdue records, and rating records for the customer under retail risk exposure.
Optionally, the apparatus further comprises:
the time sequence analysis module is used for acquiring data objects in the characteristic width table corresponding to the retail risk exposure before acquiring samples for constructing the model in that table, wherein the data objects comprise source data and default identifications; calling a time sequence analysis calculation interface to calculate the bad-account rate of the data objects within a set period, and obtaining the association between the bad-account rate and time; and displaying the association between the bad-account rate and time so as to instruct business personnel to select samples for constructing the model from the data objects based on the association.
Optionally, the apparatus further comprises:
and the sampling module is used for acquiring a sampling rule configured by business personnel after the association between the bad-account rate and time is displayed, and calling a sample processing interface to sample the data objects based on the sampling rule.
Optionally, the high-level subdivision module 1020 is specifically configured to:
responding to user configuration to perform first high-level subdivision on the samples to obtain at least two sample sets subjected to first high-level subdivision;
and responding to user configuration to perform high-level subdivision again on the samples in each sample set subjected to the high-level subdivision for the first time, so as to obtain at least two sample sets subjected to high-level subdivision again.
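The two rounds of high-level subdivision amount to grouping the samples by one user-configured field and then grouping each resulting set by another. A minimal sketch follows; the configuration is assumed to be an ordered list of field names, and the example fields (`product`, `delinquent`) are hypothetical.

```python
from collections import defaultdict

def subdivide(samples, field_name):
    """Group samples by the value of one field."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[field_name]].append(sample)
    return dict(groups)

def high_level_subdivision(samples, config_fields):
    """Apply the configured fields one after another.

    Returns {tuple_of_field_values: samples}, one entry per sample set.
    """
    segments = {(): list(samples)}
    for field_name in config_fields:
        next_segments = {}
        for key, members in segments.items():
            for value, subset in subdivide(members, field_name).items():
                next_segments[key + (value,)] = subset
        segments = next_segments
    return segments

if __name__ == "__main__":
    data = [
        {"product": "card", "delinquent": False},
        {"product": "card", "delinquent": True},
        {"product": "mortgage", "delinquent": False},
    ]
    for key, members in high_level_subdivision(data, ["product", "delinquent"]).items():
        print(key, len(members))
```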
Optionally, the apparatus further comprises:
and the characteristic screening module is used for calculating the evaluation indexes of the characteristic fields of the samples after the first high-level subdivision after responding to the user configuration and obtaining at least two sample sets after the first high-level subdivision, and screening the characteristic fields contained in the samples in each sample set based on the evaluation indexes.
Further, calculating an evaluation index of a feature field of the sample after the first high-level subdivision, includes:
calling equal-frequency and equal-width box-separating interfaces to perform data box separation on the characteristic fields of all samples in each sample set;
and respectively calculating the evaluation indexes of the characteristic fields in each sub-box.
Further, before the evaluation indexes of the feature fields in the respective bins are calculated, the method further includes:
acquiring field contents of which the characteristic fields in the samples in the sample set after the high-level subdivision are application scores or behavior scores;
and mapping the field content according to a preset rule so as to enable the value of the field content to be within a preset value interval.
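The preset mapping rule is not specified in the text. A minimal sketch, assuming a linear rescaling with clipping, is shown below; the function name, the source score range and the target interval are hypothetical.

```python
def map_score(score, src_min, src_max, dst_min=0.0, dst_max=100.0):
    """Linearly map a score from [src_min, src_max] into [dst_min, dst_max].

    Scores outside the source range are clipped to its ends first, so the
    result always lies within the preset interval.
    """
    if src_max <= src_min:
        raise ValueError("src_max must be greater than src_min")
    clipped = min(max(score, src_min), src_max)
    ratio = (clipped - src_min) / (src_max - src_min)
    return dst_min + ratio * (dst_max - dst_min)

if __name__ == "__main__":
    for raw in (250, 300, 650, 850, 900):
        print(raw, round(map_score(raw, 300, 850), 2))
```

Mapping application scores and behavior scores onto the same interval makes their bins comparable before the evaluation indexes are calculated.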
Optionally, the apparatus further comprises:
the maturity analysis module is used for calling a maturity analysis interface after acquiring a sample used for constructing the model in the characteristic width table corresponding to the retail risk exposure, performing maturity effect analysis on the sample by adopting a cross-section method, and displaying an analysis result; and determining a maturity effect point according to the analysis result and the maturity effect point definition, wherein the maturity effect point is used for calculating a maturity adjustment parameter.
Optionally, the apparatus further comprises:
the model calibration module is used for calculating the long-term average value of each retail risk exposure sub-pool after the retail risk exposure sub-pool model is generated based on the high-level subdivision result and the layer-by-layer subdivision result; calling a maturity adjustment parameter calculation interface, and calculating the maturity adjustment parameter of each retail risk exposure sub-pool based on the maturity effect points; and respectively calibrating the risk parameters of each retail risk exposure sub-pool according to the long-term average value and the maturity adjustment parameters to obtain a risk parameter calibration result.
Optionally, the apparatus further comprises:
and the model verification module is used for verifying the heterogeneity, the judiciousness, the accuracy, the concentration, the distinguishing capability and the stability of the retail risk exposure sub-pool model after the retail risk exposure sub-pool model is generated based on the high-level subdivision result and the layer-by-layer subdivision result.
The retail risk exposure pooling device provided by the embodiment of the invention can execute the retail risk exposure pooling method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 11 is a schematic structural diagram of a computer apparatus according to an embodiment of the present invention, as shown in fig. 11, the computer apparatus includes a processor 1110, a memory 1120, an input device 1130, and an output device 1140; the number of processors 1110 in the computer device may be one or more, and one processor 1110 is taken as an example in fig. 11; the processor 1110, the memory 1120, the input device 1130, and the output device 1140 in the computer apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 11.
The memory 1120, which may be a computer-readable storage medium, may be used to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the method for pooling retail risk exposure in embodiments of the present invention (e.g., the sample acquisition module 1010, the high-level segmentation module 1020, and the model generation module 1030 in the device for pooling retail risk exposure). Processor 1110, by executing software programs, instructions and modules stored in memory 1120, performs various functional applications and data processing of the computer device, i.e., implementing the retail risk exposure pooling method described above.
The memory 1120 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 1120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 1120 may further include memory located remotely from the processor 1110, which may be connected to a computer device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 1130 may be used to receive numeric or character information input by a user and generate key signal inputs related to user settings and function control of the computer apparatus. The output device 1140 may include a display device such as a display screen for displaying the timing analysis result, the decision tree structure, the statistical information of the tree nodes, and the like.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a method of pooling retail risk exposure, the method comprising:
obtaining a sample for constructing a model in a characteristic width table corresponding to retail risk exposure;
performing high-level subdivision on the sample in response to user configuration;
and subdividing samples in the sample set obtained by high-level subdivision layer by layer according to the sub-pool risk factors, and generating a retail risk exposure sub-pool model based on the high-level subdivision result and the layer-by-layer subdivision result.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the above method operations, and may also perform related operations in the retail risk exposure pooling method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the retail risk exposure pooling device, the units and modules included in the embodiment are only divided according to the functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (15)

1. A method of pooling retail risk exposure, comprising:
obtaining, from a feature wide table corresponding to retail risk exposure, a sample for constructing a model;
performing high-level subdivision on the sample in response to a user configuration;
and subdividing, layer by layer according to sub-pool risk factors, the samples in the sample sets obtained by the high-level subdivision, and generating a retail risk exposure pooling model based on the high-level subdivision result and the layer-by-layer subdivision result.
2. The method of claim 1, further comprising, before obtaining the sample for constructing a model from the feature wide table corresponding to retail risk exposure:
acquiring, from a banking business source system and based on the sub-pool frequency, the source data required for constructing the retail risk exposure pooling model;
and storing the source data in the corresponding pre-configured feature wide table according to the category of the retail risk exposure.
3. The method of claim 1, wherein the sample comprises historical account transaction data, account status data, payment data, overdue records, and rating records of the customer corresponding to the retail risk exposure.
4. The method of claim 1, further comprising, before obtaining the sample for constructing a model from the feature wide table corresponding to retail risk exposure:
acquiring data objects from the feature wide table corresponding to retail risk exposure, wherein each data object comprises source data and a default flag;
calling a time-series analysis calculation interface to calculate the bad debt rate of the data objects over a set period, and obtaining the association between the bad debt rate and time;
and displaying the association between the bad debt rate and time, so as to guide business personnel in selecting the sample for constructing the model from the data objects based on that association.
5. The method of claim 4, further comprising, after displaying the association between the bad debt rate and time:
acquiring a sampling rule configured by the business personnel, and calling a sample processing interface to sample the data objects based on the sampling rule.
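As an illustration of the time-series calculation in claim 4, the following minimal sketch computes a bad debt rate per period from a list of data objects. It assumes each data object exposes a period label and a binary default flag; the function name `bad_debt_rate_by_period` and the keys `period` and `default_flag` are hypothetical and are not defined by the claims.

```python
from collections import defaultdict


def bad_debt_rate_by_period(data_objects):
    """Return {period: bad debt rate}, ordered by period.

    ASSUMPTION: each data object is a dict with a 'period' key
    (for example '2020-01') and a 'default_flag' key (1 = default, 0 = no default).
    """
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for obj in data_objects:
        totals[obj["period"]] += 1
        defaults[obj["period"]] += obj["default_flag"]
    # Share of defaulted objects among all objects observed in each period.
    return {p: defaults[p] / totals[p] for p in sorted(totals)}


data_objects = [
    {"period": "2020-01", "default_flag": 0},
    {"period": "2020-01", "default_flag": 1},
    {"period": "2020-02", "default_flag": 0},
    {"period": "2020-02", "default_flag": 0},
]
rates = bad_debt_rate_by_period(data_objects)
```

The resulting mapping from period to rate is the kind of series that could be displayed to business personnel before they choose a sampling window.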
6. The method of claim 1, wherein performing high-level subdivision on the sample in response to a user configuration comprises:
performing a first high-level subdivision on the sample in response to the user configuration, to obtain at least two sample sets resulting from the first high-level subdivision;
and performing, in response to the user configuration, a further high-level subdivision on the samples in each sample set resulting from the first high-level subdivision, to obtain at least two sample sets resulting from the further high-level subdivision.
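The two-stage subdivision in claim 6 can be pictured as a nested grouping. The sketch below is only one possible reading: it assumes the user configuration names one field per stage, and the helper names `subdivide` and `two_stage_subdivision` as well as the fields `product` and `overdue` are hypothetical.

```python
from collections import defaultdict


def subdivide(samples, field):
    """Split samples into sample sets keyed by the value of a configured field."""
    sets = defaultdict(list)
    for sample in samples:
        sets[sample[field]].append(sample)
    return dict(sets)


def two_stage_subdivision(samples, first_field, second_field):
    """Apply a first subdivision, then subdivide each resulting set again."""
    first = subdivide(samples, first_field)
    return {key: subdivide(group, second_field) for key, group in first.items()}


# ASSUMPTION: 'product' and 'overdue' stand in for user-configured fields.
samples = [
    {"id": 1, "product": "mortgage", "overdue": False},
    {"id": 2, "product": "mortgage", "overdue": True},
    {"id": 3, "product": "credit_card", "overdue": False},
    {"id": 4, "product": "credit_card", "overdue": True},
]
result = two_stage_subdivision(samples, "product", "overdue")
```

Here each first-stage set (one per product) is split again by overdue status, giving at least two sample sets at each stage.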
7. The method of claim 6, further comprising, after performing the first high-level subdivision on the sample in response to the user configuration to obtain the at least two sample sets:
calculating evaluation indexes of the feature fields of the samples resulting from the first high-level subdivision, and screening the feature fields contained in the samples of each sample set based on the evaluation indexes.
8. The method of claim 7, wherein calculating the evaluation indexes of the feature fields of the samples resulting from the first high-level subdivision comprises:
calling equal-frequency and equal-width binning interfaces to bin the data of the feature fields of the samples in each sample set;
and calculating the evaluation index of the feature field in each bin.
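A minimal sketch of the two binning schemes named in claim 8 follows. The claim does not say which evaluation index is calculated per bin, so the per-bin bad rate used here is an assumption chosen purely for illustration; all function names are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Assign each value a bin index using n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign bin indices so that each bin holds roughly the same number of values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def bad_rate_per_bin(bins, default_flags):
    """ASSUMPTION: the per-bin evaluation index is the share of defaults."""
    totals, bads = {}, {}
    for b, flag in zip(bins, default_flags):
        totals[b] = totals.get(b, 0) + 1
        bads[b] = bads.get(b, 0) + flag
    return {b: bads[b] / totals[b] for b in sorted(totals)}


values = [1, 2, 3, 4, 100, 200]
flags = [0, 0, 1, 0, 1, 1]
width_bins = equal_width_bins(values, 2)
freq_bins = equal_frequency_bins(values, 2)
freq_rates = bad_rate_per_bin(freq_bins, flags)
```

With skewed data such as `values`, equal-width binning puts almost every sample into the first bin, while equal-frequency binning splits the samples evenly. Note that this simple equal-frequency version assigns by rank, so tied values can fall into different bins.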
9. The method of claim 8, further comprising, before calculating the evaluation index of the feature field in each bin:
acquiring the field content of those feature fields, in the samples of the sample sets resulting from the high-level subdivision, that are application scores or behavior scores;
and mapping the field content according to a preset rule so that its value falls within a preset value interval.
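The preset rule in claim 9 is not specified. One simple rule that satisfies the stated goal is a clipped linear rescaling, sketched below as an assumption; the function name `map_score` and the score range 300 to 850 are hypothetical.

```python
def map_score(score, src_min, src_max, dst_min=0.0, dst_max=1.0):
    """Linearly map a score from [src_min, src_max] into [dst_min, dst_max].

    ASSUMPTION: the preset rule is a linear rescaling; scores outside the
    source range are clipped to its end points before mapping.
    """
    if src_max <= src_min:
        raise ValueError("src_max must be greater than src_min")
    clipped = min(max(score, src_min), src_max)
    ratio = (clipped - src_min) / (src_max - src_min)
    return dst_min + ratio * (dst_max - dst_min)


# Hypothetical application score range of 300 to 850 mapped onto 0 to 100.
mapped = [map_score(s, 300, 850, 0, 100) for s in (300, 575, 850, 900)]
```

Clipping guarantees that every mapped value stays inside the preset interval, including out-of-range inputs such as 900 in the example.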
10. The method of claim 1, further comprising, after obtaining the sample for constructing a model from the feature wide table corresponding to retail risk exposure:
calling a maturity analysis interface, analyzing the maturity effect of the sample using a cross-section method, and displaying the analysis result;
and determining a maturity effect point according to the analysis result and the definition of the maturity effect point, wherein the maturity effect point is used for calculating a maturity adjustment parameter.
11. The method of claim 10, further comprising, after generating the retail risk exposure pooling model based on the high-level subdivision result and the layer-by-layer subdivision result:
calculating a long-term average for each retail risk exposure sub-pool;
calling a maturity adjustment parameter calculation interface, and calculating the maturity adjustment parameter of each retail risk exposure sub-pool based on the maturity effect point;
and calibrating the risk parameters of each retail risk exposure sub-pool according to the long-term average and the maturity adjustment parameter, to obtain a risk parameter calibration result.
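The calibration in claim 11 combines a long-term average with a maturity adjustment parameter, but the claim does not state how the two are combined. The sketch below assumes, purely for illustration, that the long-term average is the arithmetic mean of yearly default rates and that the adjustment is applied multiplicatively; the function names and input layout are hypothetical.

```python
from statistics import mean


def long_term_average(yearly_default_rates):
    """ASSUMPTION: the long-term average is the mean of yearly default rates."""
    return mean(yearly_default_rates)


def calibrate_pools(pool_history, maturity_adjustments):
    """Return a calibrated risk parameter for each sub-pool.

    ASSUMPTION: the maturity adjustment parameter scales the long-term
    average multiplicatively; sub-pools without a parameter are left unadjusted.
    """
    calibrated = {}
    for pool, rates in pool_history.items():
        adjustment = maturity_adjustments.get(pool, 1.0)
        calibrated[pool] = long_term_average(rates) * adjustment
    return calibrated


pool_history = {"pool_A": [0.01, 0.03], "pool_B": [0.04, 0.06]}
maturity_adjustments = {"pool_A": 1.5}
result = calibrate_pools(pool_history, maturity_adjustments)
```

Other combination rules, such as an additive adjustment, would fit the wording of the claim equally well; the multiplicative form is shown only because it is compact.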
12. The method of claim 1, further comprising, after generating the retail risk exposure pooling model based on the high-level subdivision result and the layer-by-layer subdivision result:
verifying the heterogeneity, prudence, accuracy, concentration, discriminative power and stability of the retail risk exposure pooling model.
13. A retail risk exposure pooling device, comprising:
a sample acquisition module for obtaining, from a feature wide table corresponding to retail risk exposure, a sample for constructing a model;
a high-level subdivision module for performing high-level subdivision on the sample in response to a user configuration;
and a model generation module for subdividing, layer by layer according to sub-pool risk factors, the samples in the sample sets obtained by the high-level subdivision, and generating a retail risk exposure pooling model based on the high-level subdivision result and the layer-by-layer subdivision result.
14. A computer device, comprising:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of pooling retail risk exposure of any of claims 1-12.
15. A storage medium containing computer-executable instructions which, when executed by a computer processor, perform the method of pooling retail risk exposure of any of claims 1-12.
CN202111276004.9A 2021-10-29 2021-10-29 Retail risk exposure pool dividing method and device, computer equipment and medium Pending CN114004491A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111276004.9A CN114004491A (en) 2021-10-29 2021-10-29 Retail risk exposure pool dividing method and device, computer equipment and medium

Publications (1)

Publication Number Publication Date
CN114004491A true CN114004491A (en) 2022-02-01

Family

ID=79925967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111276004.9A Pending CN114004491A (en) 2021-10-29 2021-10-29 Retail risk exposure pool dividing method and device, computer equipment and medium

Country Status (1)

Country Link
CN (1) CN114004491A (en)

Similar Documents

Publication Publication Date Title
CN108564286B (en) Artificial intelligent financial wind-control credit assessment method and system based on big data credit investigation
CN105718490A (en) Method and device for updating classifying model
CN109543925B (en) Risk prediction method and device based on machine learning, computer equipment and storage medium
CN111951097A (en) Enterprise credit risk assessment method, device, equipment and storage medium
CN110349000A (en) Method, apparatus and electronic equipment are determined based on the volume strategy that mentions of tenant group
AU2019100968A4 (en) A Credit Reporting Evaluation System Based on Mixed Machine Learning
CN110909970A (en) Credit scoring method and device
CN114511019A (en) Sensitive data classification and grading identification method and system
CN111199469A (en) User payment model generation method and device and electronic equipment
CN111652661B (en) Mobile phone client user loss early warning processing method
CN111178633A (en) Method and device for predicting scenic spot passenger flow based on random forest algorithm
CN112150094A (en) Model training method, model-based evaluation method and model-based evaluation device
CN110378739B (en) Data traffic matching method and device
CN111210332A (en) Method and device for generating post-loan management strategy and electronic equipment
CN113919432A (en) Classification model construction method, data classification method and device
CN116911994B (en) External trade risk early warning system
CN110059749B (en) Method and device for screening important features and electronic equipment
CN114004491A (en) Retail risk exposure pool dividing method and device, computer equipment and medium
Wirawan et al. Application of data mining to prediction of timeliness graduation of students (a case study)
CN111489134A (en) Data model construction method, device, equipment and computer readable storage medium
CN112232944B (en) Method and device for creating scoring card and electronic equipment
CN113298448B (en) Lease index analysis method and system based on Internet and cloud platform
CN116777597A (en) Financial risk assessment method, device, storage medium and computer equipment
CN116523628A (en) Credit model definition method based on public credit big data
CN116151956A (en) Credit approval processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination