CN113642923A - Bad asset pack value evaluation method based on historical collection urging data - Google Patents

Bad asset pack value evaluation method based on historical collection urging data

Info

Publication number
CN113642923A
Authority
CN
China
Prior art keywords
variables
model
asset
bad
bad asset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111004797.9A
Other languages
Chinese (zh)
Inventor
庄涤坤
刘建新
赵雪
黄平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jianyuan Heguang Beijing Technology Co ltd
Original Assignee
Jianyuan Heguang Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jianyuan Heguang Beijing Technology Co ltd
Priority to CN202111004797.9A
Publication of CN113642923A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Abstract

The invention discloses a bad asset pack value evaluation method based on historical collection urging data, comprising the following steps: step 1, constructing a bad asset collection prediction model; step 2, ranking the variables by importance; step 3, taking the union of the most important variables of each model to obtain a most important variable set; step 4, using the variables in the most important variable set to obtain newly constructed derived variables and their construction rules; step 5, creating a plurality of virtual asset packs by sampling, and obtaining the derived variables of each virtual asset pack according to step 4; step 6, combining the derived variables from step 5 with the total amount recovered for each virtual asset pack to obtain a bad asset pack value evaluation model; and step 7, creating the derived variables of the bad asset pack to be valued, substituting them into the bad asset pack value evaluation model from step 6, and predicting the value of the asset pack. The invention abstracts asset-pack-level features that are crucial to model construction, and thereby ensures the accuracy of the multi-factor regression model for bad assets.

Description

Bad asset pack value evaluation method based on historical collection urging data
Technical Field
The invention relates to the technical field of asset assessment, and in particular to a bad asset pack value evaluation method based on historical collection urging data.
Background
Financial bad assets are the various equity, debt, and physical assets held by licensed financial institutions, such as commercial banks, that cannot bring those institutions normal economic benefits. Financial bad assets are mainly disposed of through litigation-based collection, debt restructuring, transfer of creditor's rights, debt transfer, asset securitization, and similar means. Disposal of bad assets depends on reasonable evaluation and pricing of those assets, and the evaluation serves as an important reference for transactions between buyers and sellers on the bad asset market.
The guiding opinions on financial bad asset assessment issued by the China Appraisal Society in 2017 set no specific requirements on the value type, the assessment method, or the specific assessment process. Only Article 17 touches on them, stating that the asset appraisal professional should ascertain the basic circumstances of the appraisal engagement and properly select the value type and assessment method according to the assessment purpose, the assessment object, the asset disposal mode, the available assessment data, and other factors; this is a framework requirement and offers little practical guidance. At present the market has no mature method for valuing bad asset transfers, and appraisal agencies cannot produce a mature valuation report on short notice. As a result, bad asset transfer prices on the market are highly random and uncertain.
A bad asset pack typically contains many bad asset cases, and the circumstances and nature of each case differ greatly. When a bad asset pack is valued, information is asymmetric between buyer and seller, and relatively complete information on the debtors' finances and future income is lacking, while the realizable present value of the debt depends on each debtor's actual financial condition and willingness to repay.
Current methods for valuing bad asset packs mainly include the following:
1) Static cash flow discounting model: the key to this method is determining the interest rate and the cash flows. In practice, the greatest difficulties are determining future cash flows and predicting the future interest rate trend. Because detailed knowledge of the debtor in each case and quantification of the asset attributes are not possible during a bad asset transaction, the quality and cash flow of an individual case are very difficult to judge and define. The method therefore has little practical value for evaluation during a transaction;
2) Monte Carlo simulation: a calculation method based on probability theory and statistics. Its basic principle is as follows: starting from the initial price of the asset and taking prepayment and default into account, simulate various cash flow paths, obtain the cash flow under each path and discount it, then take the weighted average of the discounted values over all paths to obtain the theoretical price of the asset. This approach is likewise limited by whether cash flow information is obtainable during a bad asset transaction;
3) Multi-factor regression model: built on sample data of bad asset packs. The factors that influence the final value of a bad asset pack are identified by summarizing historical bad asset packs, and regression analysis is then performed on these factors with a statistical model to establish the regression model. Multi-factor regression is a statistical analysis method and is relatively well suited to pricing bad assets, but it needs a large number of bad asset disposal cases, that is, bad asset packs, as its basis. The accuracy of the final valuation also depends to a great extent on the variables selected when the regression equation is built: if the factors initially chosen as influencing the recovery rate of the bad assets are wrong, the final result may be far from reality.
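The discounting step shared by methods 1) and 2) can be made concrete with a minimal sketch. The function names, the flat discount rate, and the per-period repayment probability are illustrative assumptions, not part of the methods as described; equally likely simulated paths are averaged with equal weight.

```python
import random

def present_value(cash_flows, rate):
    """Discount per-period cash flows (periods 1, 2, ...) at a flat rate."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

def monte_carlo_value(payment, periods, pay_prob, rate, n_paths=10_000, seed=0):
    """Average the discounted value over simulated cash flow paths.

    Hypothetical path model: in each period the debtor pays `payment`
    with probability `pay_prob`, and nothing otherwise.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        path = [payment if rng.random() < pay_prob else 0.0 for _ in range(periods)]
        total += present_value(path, rate)
    return total / n_paths

# Illustrative inputs only: 12 monthly payments of 500 at a 1% monthly rate.
estimate = monte_carlo_value(payment=500.0, periods=12, pay_prob=0.3, rate=0.01)
```

Both functions need per-case cash flow assumptions as input, which is exactly the information the text says is hard to obtain during a transaction.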
Among the existing methods, methods 1) and 2) rely on calculating or simulating case cash flows. Cash flow is subject to many objective factors (the various attributes of a bad asset case) and subjective factors (the debtor's actual willingness to repay), as well as factors that are hard to capture in case records, such as the debtor's current financial condition, job stability, family burden, and health. These methods are therefore poorly suited to value evaluation during a bad asset pack transaction. It is accordingly desirable to provide a bad asset pack value evaluation method based on historical collection urging data.
Disclosure of Invention
An object of the present invention is to solve at least the above problems and to provide at least the advantages described later.
Another object of the invention is to provide a bad asset pack value evaluation method based on historical collection urging data. The method adopts a multi-factor regression approach and creates a plurality of virtual asset packs to supply the large number of data samples required for training machine learning models. It also abstracts comprehensive, pack-level features that are important for model construction, thereby ensuring the accuracy of the multi-factor regression model for bad assets.
To achieve these objects and other advantages in accordance with the purpose of the invention, there is provided a bad asset pack value evaluation method based on historical collection urging data, comprising:
Step 1: construct a plurality of different machine learning models from existing bad asset cases and their collection results, and train them as classification models to obtain bad asset collection prediction models.
Step 2: rank the variables used to build each bad asset collection prediction model by their effect on that model, to obtain for each model a subset of the most important variables influencing recovery of bad asset cases.
Step 3: take the union of the most important variables in each subset, and select from the union several most important variables as the final most important variable set.
Step 4: derive, from the variables in the most important variable set, overall feature variables of a bad asset pack, to obtain the newly constructed derived variables and the construction rules for the derived variables needed for the asset pack to be valued.
Step 5: sample the existing bad asset case data to create a plurality of virtual asset packs, and obtain the derived variables of each virtual asset pack from the variables in the most important variable set and the construction rules of step 4.
Step 6: combine the derived variables from step 5 with the total amount recovered for each virtual asset pack to obtain a bad asset pack value evaluation model.
Step 7: summarize the bad asset pack to be valued, create its derived variables using the construction rules, and substitute them into the bad asset pack value evaluation model of step 6; use several machine learning algorithms, combine the results predicted by the different algorithms with a voting regressor, and return the average predicted value as the predicted value of the bad asset pack.
Preferably, the target variable of the model in step 1 is the collection result: if the collection result of a bad asset case is "recovered", the target variable takes the value 1; if the collection result is "not recovered", the target variable takes the value 0;
wherein the feature variables of the model are all available variables in the bad asset case data and its collection result data.
Preferably, the algorithms used to train the classification models in step 1 include logistic regression, random forest, and XGBoost.
Preferably, the union in step 3 is obtained as follows:
take the top m most important features from each of the three algorithms, and denote the three resulting sets as:
$A = \{a_1, a_2, \ldots, a_m\}$, the top m most important variables of model 1;
$B = \{b_1, b_2, \ldots, b_m\}$, the top m most important variables of model 2;
$C = \{c_1, c_2, \ldots, c_m\}$, the top m most important variables of model 3;
the union is $A \cup B \cup C = \{x : x \in A \text{ or } x \in B \text{ or } x \in C\}$;
wherein models 1, 2, and 3 are the models of the three algorithms;
m is the number of most important variables selected from each algorithm;
$a_i$, $b_i$, and $c_i$ denote the i-th most important variable selected by model 1, model 2, and model 3 respectively, with i ranging from 1 to m.
Preferably, the most important variables are taken from the union in step 3 according to summed importance: for a given variable, Import = (importance of the variable in Model A) + (importance of the variable in Model B) + (importance of the variable in Model C), where Models A, B, and C are the three models above.
Preferably, step 4 includes newly constructed derived variables for numerical feature variables and newly constructed derived variables for categorical feature variables.
Preferably, the sampling in step 5 is specifically: random sampling or conditional random sampling is performed, with the case as the unit, over the collection result data and the case-related factor data of historical bad asset cases, and the number of cases drawn each time lies in the range [a, b];
wherein 0 < a < b < total number of bad asset cases, and each virtual asset pack should contain no fewer than 1000 cases.
Preferably, the machine learning algorithms in step 7 include linear regression, random forest, and gradient boosting.
The invention has at least the following beneficial effects:
The method starts from historical collection data for bad assets and uses machine learning to train a collection prediction model. The model variables are then ranked by importance and several are selected. Sampling from the historical data forms a plurality of virtual bad asset packs, derived feature variables of each virtual asset pack are produced according to the variable importance, and machine learning models are fitted to the virtual asset packs to generate the final bad asset pack valuation model.
First, a bad asset collection prediction model is constructed; based on a large amount of existing case data and the collection result of each case, it predicts the probability that each case will be recovered. After training, the variables are ranked by their effect on the model, which yields the subset of most important variables influencing case recovery. Because several machine learning algorithms are used to build the collection prediction model, several such subsets are obtained. The top m variables of each subset are then combined by union, and the top n variables of the union are taken as the final most important variable set. From the variables in this set, the feature variables of the final bad asset pack are derived by rule so as to represent the relevant characteristics of a real asset pack. In this way, comprehensive pack-level features that are important for model construction are abstracted accurately from the individual bad asset cases. This addresses the problem that, because each asset pack contains many different debts or cases, each case has many original variables (such as account age, principal, interest, and the debtor's age, sex, and occupation), and individual cases differ greatly, the accuracy of the selected variables cannot otherwise be determined; the accuracy of the multi-factor regression model for bad assets is thereby further ensured.
Meanwhile, based on historical bad asset case data and the corresponding collection results, random sampling or conditional random sampling is performed with the case as the unit, creating a plurality of virtual asset packs that supply the large number of data samples required for training machine learning models. This solves the technical problem that very little bad asset pack valuation data is currently available on the market, so that a large valuation sample is hard to obtain and a machine learning algorithm, which needs a large sample, can hardly be used to build a valuation model.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
FIG. 1 is a flow chart of the bad asset pack value evaluation method based on historical collection urging data according to the present invention;
FIG. 2 is a graph of the importance ranking of variables according to the present invention.
Detailed Description
The present invention is further described in detail below with reference to the attached drawings so that those skilled in the art can implement the invention by referring to the description text.
It will be understood that terms such as "having," "including," and "comprising," as used herein, do not preclude the presence or addition of one or more other elements or groups thereof.
As shown in FIG. 1, the present invention provides a bad asset pack value evaluation method based on historical collection urging data, which includes:
Step 1: construct a plurality of different machine learning models from existing bad asset cases and their collection results, and train them as classification models to obtain bad asset collection prediction models.
Step 2: rank the variables used to build each bad asset collection prediction model by their effect on that model, to obtain for each model a subset of the most important variables influencing recovery of bad asset cases.
Step 3: take the union of the most important variables in each subset, and select from the union several most important variables as the final most important variable set.
Step 4: derive, from the variables in the most important variable set, overall feature variables of a bad asset pack, to obtain the newly constructed derived variables and the construction rules for the derived variables needed for the asset pack to be valued.
Step 5: sample the existing bad asset case data to create a plurality of virtual asset packs, and obtain the derived variables of each virtual asset pack from the variables in the most important variable set and the construction rules of step 4.
Step 6: combine the derived variables from step 5 with the total amount recovered for each virtual asset pack to obtain a bad asset pack value evaluation model.
Step 7: summarize the bad asset pack to be valued, create its derived variables using the construction rules, and substitute them into the bad asset pack value evaluation model of step 6; use several machine learning algorithms, combine the results predicted by the different algorithms with a voting regressor, and return the average predicted value as the predicted value of the bad asset pack.
In this scheme, the method starts from historical collection data for bad assets and uses machine learning to train a collection prediction model. The model variables are then ranked by importance and several are selected. Sampling from the historical data forms a plurality of virtual bad asset packs, derived feature variables of each virtual asset pack are produced according to the variable importance, and machine learning models are fitted to the virtual asset packs to generate the final bad asset pack valuation model. The variable importance ranking is shown in FIG. 2.
First, a bad asset collection prediction model is constructed; based on a large amount of existing case data and the collection result of each case, it predicts the probability that each case will be recovered. After training, the variables are ranked by their effect on the model, which yields the subset of most important variables influencing case recovery. Because several machine learning algorithms are used to build the collection prediction model, several such subsets are obtained. The top m variables of each subset are then combined by union, and the top n variables of the union are taken as the final most important variable set. From the variables in this set, the feature variables of the final bad asset pack are derived by rule so as to represent the relevant characteristics of a real asset pack. In this way, comprehensive pack-level features that are important for model construction are abstracted accurately from the individual bad asset cases. This addresses the problem that, because each asset pack contains many different debts or cases, each case has many original variables (such as account age, principal, interest, and the debtor's age, sex, and occupation), and individual cases differ greatly, the accuracy of the selected variables cannot otherwise be determined; the accuracy of the multi-factor regression model for bad assets is thereby further ensured.
Meanwhile, based on historical bad asset case data and the corresponding collection results, random sampling or conditional random sampling is performed with the case as the unit, creating a plurality of virtual asset packs that supply the large number of data samples required for training machine learning models. This solves the technical problem that very little bad asset pack valuation data is currently available on the market, so that a large valuation sample is hard to obtain and a machine learning algorithm, which needs a large sample, can hardly be used to build a valuation model.
The virtual asset pack data required for model training is created, the feature variables of each virtual asset pack required for model training are created, and the total amount recovered over all cases in a virtual asset pack is taken as the final value of that asset pack. The bad asset pack value evaluation model is constructed from these three elements.
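The pairing of pack-level features with the pack's value can be sketched as follows. This is a minimal illustration under stated assumptions: the field names `principal` and `recovered_amount` and the helper `simple_features` are hypothetical, and a real feature function would apply the construction rules of step 4.

```python
def build_training_row(pack, feature_fn, recovered_field="recovered_amount"):
    """Pair a pack's derived variables with its target value.

    The target is the total amount recovered over all cases in the pack.
    """
    target = sum(case[recovered_field] for case in pack)
    return feature_fn(pack), target

def simple_features(pack):
    """Hypothetical stand-in for the derived-variable construction rules."""
    return {
        "case_count": len(pack),
        "principal_sum": sum(case["principal"] for case in pack),
    }

# One tiny illustrative pack; real virtual packs hold at least 1000 cases.
pack = [
    {"principal": 1000.0, "recovered_amount": 200.0},
    {"principal": 500.0, "recovered_amount": 0.0},
]
features, target = build_training_row(pack, simple_features)
```

Applying `build_training_row` to every virtual asset pack would give one training sample per pack for the value evaluation model.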
When the bad asset pack value evaluation model is used to predict a new bad asset pack, all cases in the asset pack are first summarized by the method of step 4 and the derived variables are created according to the rules; the asset pack value is then predicted using the model constructed in step 7.
In a preferred scheme, the target variable of the model in step 1 is the collection result: if the collection result of a bad asset case is "recovered", the target variable takes the value 1; if the collection result is "not recovered", the target variable takes the value 0;
wherein the feature variables of the model are all available variables in the bad asset case data and its collection result data.
In the above scheme, all available variables in the data include, for example: the debtor's location, gender, age, occupation, education level, and income status; the principal of the bad asset, the repaid principal, the remaining principal, interest, and penalties; the date of the last payment; and so on.
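The binary target encoding of step 1 can be sketched in a few lines. The label strings "recovered" and "not recovered", the field name `collection_result`, and the sample record are assumptions made for illustration; actual data may label the outcome differently.

```python
def encode_target(collection_result):
    """Map a case's collection result to the binary target of step 1."""
    mapping = {"recovered": 1, "not recovered": 0}  # assumed label strings
    return mapping[collection_result]

def split_case(case, target_field="collection_result"):
    """Split one case record into (feature dict, binary target)."""
    features = {k: v for k, v in case.items() if k != target_field}
    return features, encode_target(case[target_field])

# Hypothetical case record using a few of the variables listed above.
case = {
    "gender": "F",
    "age": 41,
    "principal": 12000.0,
    "remaining_principal": 9000.0,
    "collection_result": "recovered",
}
features, target = split_case(case)
```

Every remaining field is kept as a feature, matching the statement that all available variables are used.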
In a preferred embodiment, the algorithms used to train the classification models in step 1 include logistic regression, random forest, and XGBoost.
In a preferred embodiment, the union in step 3 is obtained as follows:
take the top m most important features from each of the three algorithms, and denote the three resulting sets as:
$A = \{a_1, a_2, \ldots, a_m\}$, the top m most important variables of model 1;
$B = \{b_1, b_2, \ldots, b_m\}$, the top m most important variables of model 2;
$C = \{c_1, c_2, \ldots, c_m\}$, the top m most important variables of model 3;
the union is $A \cup B \cup C = \{x : x \in A \text{ or } x \in B \text{ or } x \in C\}$;
wherein models 1, 2, and 3 are the models of the three algorithms;
m is the number of most important variables selected from each algorithm;
$a_i$, $b_i$, and $c_i$ denote the i-th most important variable selected by model 1, model 2, and model 3 respectively, with i ranging from 1 to m;
in the expression for the union, $x \in A$ covers all members of A, $x \in B$ covers all members of B, and $x \in C$ covers all members of C.
In a preferred embodiment, step 3 takes the most important variables from the union according to summed importance: for a given variable, Import = (importance of the variable in Model A) + (importance of the variable in Model B) + (importance of the variable in Model C), where Models A, B, and C are the three models above.
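The union and the summed-importance selection can be sketched as follows. This is a minimal illustration: the importance scores are made-up numbers, a variable missing from a model is assumed to contribute 0, and the scores of the three models are assumed to be on comparable scales (for example, each normalized to sum to 1), which the text does not specify.

```python
def top_m(importances, m):
    """Return the m variable names with the highest importance in one model."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return set(ranked[:m])

def select_variables(model_importances, m, n):
    """Union the top-m sets of all models, then keep the n variables
    with the largest importance summed over the models."""
    union = set().union(*(top_m(imp, m) for imp in model_importances))
    summed = {v: sum(imp.get(v, 0.0) for imp in model_importances) for v in union}
    # Sort by descending summed importance; break ties by name for determinism.
    return sorted(union, key=lambda v: (-summed[v], v))[:n]

# Made-up importance scores for three models.
model_1 = {"principal": 0.5, "age": 0.3, "gender": 0.2}
model_2 = {"principal": 0.4, "account_age": 0.35, "age": 0.25}
model_3 = {"account_age": 0.6, "interest": 0.3, "principal": 0.1}

selected = select_variables([model_1, model_2, model_3], m=2, n=3)
# selected == ["principal", "account_age", "age"]
```

With m = 2 the union holds four variables; `interest` has the smallest summed importance and is dropped when n = 3.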
In a preferred embodiment, step 4 includes newly constructed derived variables for numerical feature variables and newly constructed derived variables for categorical feature variables.
In the above scheme, for numerical features (e.g., principal, remaining principal), the newly constructed derived variables include: 1. sum; 2. mean; 3. median; 4. maximum; 5. variance;
for categorical feature variables (e.g., age group, gender, account age), the newly constructed derived variables include the following (taking account age as the categorical variable, with the 1st category "M12-M24" and the 2nd category "M24+"):
1. Number of cases in the 1st category of the categorical feature
2. Total principal of cases in the 1st category of the categorical feature
3. Total amount repaid for cases in the 1st category of the categorical feature
4. Total remaining principal of cases in the 1st category of the categorical feature
5. Total interest of cases in the 1st category of the categorical feature
6. ...
7. Number of cases in the 2nd category of the categorical feature
8. Total principal of cases in the 2nd category of the categorical feature
9. Total amount repaid for cases in the 2nd category of the categorical feature
10. Total remaining principal of cases in the 2nd category of the categorical feature
11. Total interest of cases in the 2nd category of the categorical feature
12. ...
13. And so on for the remaining categories
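The construction rules above can be sketched as two small aggregation functions. The field names and key format are illustrative assumptions, and population variance is assumed because the text does not say which variance is meant.

```python
import statistics

def numeric_derivatives(cases, field):
    """Pack-level sum, mean, median, maximum, and variance of a numerical field."""
    values = [c[field] for c in cases]
    return {
        f"{field}_sum": sum(values),
        f"{field}_mean": statistics.mean(values),
        f"{field}_median": statistics.median(values),
        f"{field}_max": max(values),
        f"{field}_variance": statistics.pvariance(values),  # assumed: population variance
    }

def categorical_derivatives(cases, cat_field, amount_fields):
    """Per-category case counts and per-category totals of the amount fields."""
    out = {}
    for c in cases:
        level = c[cat_field]
        count_key = f"{cat_field}={level}_count"
        out[count_key] = out.get(count_key, 0) + 1
        for f in amount_fields:
            total_key = f"{cat_field}={level}_{f}_total"
            out[total_key] = out.get(total_key, 0.0) + c[f]
    return out

# Three hypothetical cases with account age as the categorical feature.
cases = [
    {"account_age": "M12-M24", "principal": 1000.0, "interest": 100.0},
    {"account_age": "M12-M24", "principal": 3000.0, "interest": 300.0},
    {"account_age": "M24+", "principal": 2000.0, "interest": 50.0},
]
num = numeric_derivatives(cases, "principal")
cat = categorical_derivatives(cases, "account_age", ["principal", "interest"])
```

Merging `num` and `cat` for every selected variable would give the feature vector of one asset pack.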
In a preferred embodiment, the sampling in step 5 is specifically: random sampling or conditional random sampling is performed, with the case as the unit, over the collection result data and the case-related factor data of historical bad asset cases, and the number of cases drawn each time lies in the range [a, b];
wherein 0 < a < b < total number of bad asset cases, and each virtual asset pack should contain no fewer than 1000 cases.
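The sampling rule can be sketched with the standard library. The function name, the `condition` predicate used for conditional sampling, and sampling without replacement inside a pack are assumptions for illustration; the text fixes only the size range and the minimum of 1000 cases.

```python
import random

def sample_virtual_packs(cases, n_packs, a, b, condition=None, min_size=1000, seed=0):
    """Create n_packs virtual asset packs, each holding between a and b cases."""
    if not (0 < a < b < len(cases)):
        raise ValueError("require 0 < a < b < total number of cases")
    if a < min_size:
        raise ValueError("each virtual asset pack needs at least min_size cases")
    # Conditional sampling: restrict the pool to cases that satisfy the predicate.
    pool = [c for c in cases if condition is None or condition(c)]
    if b > len(pool):
        raise ValueError("not enough cases satisfy the condition")
    rng = random.Random(seed)
    packs = []
    for _ in range(n_packs):
        size = rng.randint(a, b)               # inclusive on both ends
        packs.append(rng.sample(pool, size))   # assumed: no repeats within a pack
    return packs

# Integers stand in for case records in this illustration.
all_cases = list(range(5000))
packs = sample_virtual_packs(all_cases, n_packs=3, a=1000, b=2000)
even_packs = sample_virtual_packs(all_cases, n_packs=1, a=1000, b=2000,
                                  condition=lambda c: c % 2 == 0)
```

Because packs are drawn independently, the same case can appear in several packs, which is what lets a limited case history yield many training samples.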
In a preferred embodiment, the machine learning algorithms in step 7 include linear regression, random forest, and gradient boosting.
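The averaging performed by the voting regressor in step 7 can be sketched as follows. Each fitted model is assumed to be exposed as a callable that maps a feature dict to a predicted value; the three lambdas are hypothetical placeholders, not fitted linear regression, random forest, or gradient boosting models. Libraries such as scikit-learn offer a `VotingRegressor` that averages its base estimators in a similar way.

```python
def voting_predict(regressors, features):
    """Return the unweighted average of the predictions of several regressors."""
    predictions = [predict(features) for predict in regressors]
    return sum(predictions) / len(predictions)

# Hypothetical stand-ins for three fitted regressors.
models = [
    lambda f: 0.10 * f["principal_sum"],
    lambda f: 0.12 * f["principal_sum"],
    lambda f: 0.14 * f["principal_sum"],
]
estimate = voting_predict(models, {"principal_sum": 1_000_000.0})
```

Averaging tends to dampen the error of any single model, which is a common reason for combining regressors of different types.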
While embodiments of the invention have been described above, the invention is not limited to the applications set forth in the description and the embodiments; it is fully applicable to the various fields to which it pertains, and further modifications may readily be made by those skilled in the art. The invention is therefore not limited to the specific details shown and described herein, provided there is no departure from the general concept defined by the appended claims and their equivalents.

Claims (8)

1. A bad asset pack value evaluation method based on historical collection urging data, comprising the following steps:
step 1, constructing a plurality of different machine learning models from existing bad asset cases and their collection urging results, and training each as a classification model to obtain bad asset collection prediction models;
step 2, for each bad asset collection prediction model, ranking the variables selected in constructing the model by their effect on the model, that is, by variable importance, to obtain a subset of the most important variables influencing the recovery of bad asset cases;
step 3, taking the union of the most important variables in the most important variable subsets, and taking a plurality of the most important variables out of the union as a final most important variable set;
step 4, deriving, from the variables in the most important variable set, overall characteristic variables of a bad asset pack, to obtain newly constructed derived variables and the construction rules of the derived variables required for the bad asset pack to be evaluated;
step 5, sampling the existing bad asset case data to create a plurality of virtual asset packs, and obtaining the derived variables of each virtual asset pack from the variables in the most important variable set and the construction rules of step 4;
step 6, combining the derived variables of step 5 with the total amount recovered by collection urging for each virtual asset pack to obtain a bad asset pack value evaluation model;
step 7, summarizing the bad asset pack to be evaluated, creating its derived variables through the construction rules, substituting the derived variables into the bad asset pack value evaluation model of step 6, using a plurality of machine learning algorithms, combining the results predicted by the different machine learning algorithms through a voting regressor, and returning the average predicted value to obtain a bad asset pack value prediction result.
2. The method for evaluating the value of a bad asset pack based on historical collection urging data as claimed in claim 1, wherein the target variable of the model in step 1 is the collection urging result: if the collection urging result of a bad asset case is "recovered", the value of the target variable is 1; if the collection urging result of the bad asset case is "not recovered", the value of the target variable is 0;
wherein the characteristic variables of the model are all available variables in the data of the bad asset cases and their collection urging results.
3. The method for evaluating the value of a bad asset pack based on historical collection urging data as claimed in claim 1, wherein the algorithms used for training the classification models in step 1 comprise: logistic regression, random forest, and XGBoost.
4. The method for evaluating the value of a bad asset pack based on historical collection urging data as claimed in claim 1, wherein the union in step 3 is formed as follows:
the top m most important features of each of the three algorithms are taken out, giving the following three sets:
A: the top m most important variables of model 1 = {a1, a2, ..., am};
B: the top m most important variables of model 2 = {b1, b2, ..., bm};
C: the top m most important variables of model 3 = {c1, c2, ..., cm};
the union is A ∪ B ∪ C = {x : x ∈ A or x ∈ B or x ∈ C};
wherein the models of the three algorithms are named model 1, model 2 and model 3 respectively;
m is the number of most important variables selected by each algorithm;
A = {a1, a2, ..., am}, where ai represents the i-th most important variable selected by model 1, and i ranges from 1 to m;
B = {b1, b2, ..., bm}, where bi represents the i-th most important variable selected by model 2, and i ranges from 1 to m;
C = {c1, c2, ..., cm}, where ci represents the i-th most important variable selected by model 3, and i ranges from 1 to m.
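By way of non-limiting illustration, the union of claim 4 may be sketched in Python as follows. The importance scores, the variable names, and the helper `top_m` are hypothetical and do not come from the application; only the set operation A ∪ B ∪ C follows the claim.

```python
def top_m(importances, m):
    """Return the set of the m variables with the largest importance score."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return set(ranked[:m])

# Hypothetical importance scores from the three models (illustrative values only).
model_1 = {"overdue_days": 0.40, "principal": 0.30, "age": 0.20, "region": 0.10}
model_2 = {"overdue_days": 0.35, "age": 0.30, "contact_count": 0.25, "principal": 0.10}
model_3 = {"principal": 0.45, "contact_count": 0.30, "overdue_days": 0.15, "region": 0.10}

m = 2
A = top_m(model_1, m)  # top m variables of model 1
B = top_m(model_2, m)  # top m variables of model 2
C = top_m(model_3, m)  # top m variables of model 3

# The union keeps every variable that is in the top m of at least one model.
union = A | B | C
```

Because the three sets may overlap, the union holds between m and 3m variables.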
5. The method for evaluating the value of a bad asset pack based on historical collection urging data as claimed in claim 1, wherein a plurality of the most important variables are taken out of the union in step 3, and wherein the importance of a given variable is defined as: Import = importance of the variable in Model A + importance of the variable in Model B + importance of the variable in Model C.
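By way of non-limiting illustration, the summed importance of claim 5 may be sketched in Python as follows. The scores, the variable names, the cut-off `k`, and both helper functions are hypothetical. The sketch further assumes that the three models report importance on a comparable scale, which the claim does not state, and treats a variable missing from a model as having importance 0.

```python
def combined_importance(variable, models):
    """Sum the importance of one variable over all models (0.0 if a model lacks it)."""
    return sum(scores.get(variable, 0.0) for scores in models)

def select_final_variables(candidates, models, k):
    """Rank candidate variables by summed importance and keep the k highest."""
    ranked = sorted(candidates,
                    key=lambda v: combined_importance(v, models),
                    reverse=True)
    return ranked[:k]

# Hypothetical per-model importance scores (illustrative values only).
models = [
    {"overdue_days": 0.40, "principal": 0.30, "age": 0.20, "region": 0.10},
    {"overdue_days": 0.35, "age": 0.30, "contact_count": 0.25, "principal": 0.10},
    {"principal": 0.45, "contact_count": 0.30, "overdue_days": 0.15, "region": 0.10},
]
# Candidate variables, e.g. the union of each model's top-ranked variables.
candidates = ["overdue_days", "principal", "age", "contact_count"]

final_variables = select_final_variables(candidates, models, k=3)
```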
6. The method for evaluating the value of a bad asset pack based on historical collection urging data as claimed in claim 1, wherein step 4 includes constructing new derived variables from numerical feature variables and constructing new derived variables from categorical feature variables.
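By way of non-limiting illustration, pack-level derived variables may be sketched in Python as follows. The claim does not name the construction rules; the mean and sum for numerical variables and the per-category share for categorical variables are assumptions chosen for this sketch, as are the function name and the field names.

```python
from collections import Counter
from statistics import mean

def derive_pack_variables(cases, numeric_vars, categorical_vars):
    """Aggregate case-level variables into pack-level derived variables."""
    derived = {}
    # Numerical variables: summarize the pack by mean and total (assumed rules).
    for name in numeric_vars:
        values = [case[name] for case in cases]
        derived[f"{name}_mean"] = mean(values)
        derived[f"{name}_sum"] = sum(values)
    # Categorical variables: share of cases falling in each category (assumed rule).
    for name in categorical_vars:
        counts = Counter(case[name] for case in cases)
        for category, count in counts.items():
            derived[f"{name}_share_{category}"] = count / len(cases)
    return derived

# Hypothetical cases of one asset pack.
cases = [
    {"principal": 1000.0, "region": "north"},
    {"principal": 3000.0, "region": "south"},
    {"principal": 2000.0, "region": "north"},
    {"principal": 2000.0, "region": "north"},
]
pack_features = derive_pack_variables(cases, ["principal"], ["region"])
```

A category that does not occur in a pack produces no key in this sketch, so a practical implementation would fix the category list in advance to give every pack the same set of derived variables.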
7. The method for evaluating the value of a bad asset pack based on historical collection urging data as claimed in claim 1, wherein the sampling in step 5 is specifically as follows: random sampling or conditional random sampling is carried out, with a case as the unit, on the collection urging result data and the case-related factor data of historical bad asset cases, and the number of cases sampled each time is in the range [a, b];
wherein 0 < a < b < total number of bad asset cases; the number of cases in each virtual asset pack should be not less than 1000.
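By way of non-limiting illustration, the creation of virtual asset packs may be sketched in Python as follows. The sketch covers only plain random sampling without replacement, not conditional random sampling, and it reads the sampling range as inclusive of both a and b; the function name, the case records, and the fixed seed are hypothetical.

```python
import random

def sample_virtual_packs(cases, n_packs, a, b, seed=0):
    """Draw n_packs virtual packs, each a random sample of between a and b cases."""
    if not 0 < a < b < len(cases):
        raise ValueError("require 0 < a < b < total number of cases")
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    packs = []
    for _ in range(n_packs):
        size = rng.randint(a, b)               # inclusive on both ends
        packs.append(rng.sample(cases, size))  # sampling without replacement
    return packs

# Hypothetical case records: (case_id, recovered_amount).
cases = [(i, float(i % 7) * 100.0) for i in range(5000)]

# a = 1000 keeps every virtual pack at no fewer than 1000 cases.
packs = sample_virtual_packs(cases, n_packs=20, a=1000, b=2000)

# Total recovered amount per virtual pack, usable as a regression target.
targets = [sum(amount for _, amount in pack) for pack in packs]
```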
8. The method for evaluating the value of a bad asset pack based on historical collection urging data as claimed in claim 1, wherein the plurality of machine learning algorithms in step 7 comprises: linear regression, random forest, and Gradient Boosting.
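By way of non-limiting illustration, the averaging performed when several regressors are combined by voting may be sketched in Python as follows. The three functions are hypothetical stand-ins for already fitted linear regression, random forest, and gradient boosting models; their coefficients and the feature name `principal_sum` are invented for the sketch. scikit-learn's `VotingRegressor` applies the same unweighted averaging by default, but the application does not state which library is used.

```python
from statistics import mean

def voting_predict(regressors, features):
    """Average the predictions of several fitted regressors for one asset pack."""
    return mean([predict(features) for predict in regressors])

# Hypothetical stand-ins for three fitted models (not real trained estimators).
def linear_model(x):
    return 0.10 * x["principal_sum"]

def forest_model(x):
    return 0.12 * x["principal_sum"]

def boosting_model(x):
    return 0.14 * x["principal_sum"]

# Derived variables of the asset pack to be evaluated (illustrative value).
pack_features = {"principal_sum": 1_000_000.0}

predicted_value = voting_predict(
    [linear_model, forest_model, boosting_model], pack_features
)
```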
CN202111004797.9A 2021-08-30 2021-08-30 Bad asset pack value evaluation method based on historical collection urging data Pending CN113642923A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111004797.9A CN113642923A (en) 2021-08-30 2021-08-30 Bad asset pack value evaluation method based on historical collection urging data

Publications (1)

Publication Number Publication Date
CN113642923A true CN113642923A (en) 2021-11-12

Family

ID=78424399

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111004797.9A Pending CN113642923A (en) 2021-08-30 2021-08-30 Bad asset pack value evaluation method based on historical collection urging data

Country Status (1)

Country Link
CN (1) CN113642923A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807618A (en) * 2021-11-19 2021-12-17 建元和光(北京)科技有限公司 Method, device and equipment for hastening receipt of bad assets based on state machine
CN114139490A (en) * 2022-02-07 2022-03-04 建元和光(北京)科技有限公司 Method, device and equipment for automatic data preprocessing
CN114139490B (en) * 2022-02-07 2022-08-02 建元和光(北京)科技有限公司 Method, device and equipment for automatic data preprocessing
CN115187356A (en) * 2022-07-21 2022-10-14 杭州源诚科技有限公司 Debtor finance production line cable information grading model, construction method and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211112