CN107633265B - Data processing method and device for optimizing credit evaluation model - Google Patents

Data processing method and device for optimizing credit evaluation model

Info

Publication number
CN107633265B
Authority
CN
China
Prior art keywords
data
model
variables
characteristic value
borrower
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710785991.2A
Other languages
Chinese (zh)
Other versions
CN107633265A (en
Inventor
陈肖黎
贾西贝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huaao Data Technology Co Ltd
Original Assignee
Shenzhen Huaao Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huaao Data Technology Co Ltd filed Critical Shenzhen Huaao Data Technology Co Ltd
Priority to CN201710785991.2A priority Critical patent/CN107633265B/en
Publication of CN107633265A publication Critical patent/CN107633265A/en
Application granted granted Critical
Publication of CN107633265B publication Critical patent/CN107633265B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention relates to a data processing method and a data processing device for optimizing a credit evaluation model, wherein the method comprises the following steps: acquiring relevant information of a borrower as sample data; dividing the sample data into a training set and a test set; carrying out data modeling by using the training set to obtain a preliminary evaluation model; testing the preliminary evaluation model by using the test set; if the test result does not meet the evaluation standard, re-dividing the sample data into a training set and a test set, and carrying out data modeling and testing again with the re-divided training set and test set; and if the test result meets the evaluation standard, finishing the training and determining a final evaluation model. The data processing method and the data processing device can optimize the credit evaluation model and improve the evaluation precision.

Description

Data processing method and device for optimizing credit evaluation model
Technical Field
The invention relates to the technical field of financial data processing, in particular to a data processing method and device for optimizing a credit evaluation model.
Background
At present, many personal loan applications are available on the market, each targeting a different group of users. To reduce risk, the repayment ability of the user needs to be evaluated; to accurately identify target customers, the user's propensity to borrow needs to be evaluated.
However, in practical applications, the big data of a loan platform is suited to analysis by a data analyst. If missing or invalid values occur in the data fed to the credit scoring model, the model may fail to detect them and then produce a biased estimate for the borrower. Also, during the startup phase, a loan company may not know which characteristics of the borrower are important in the credit scoring model, and the credit scoring model of a large loan company may be too sophisticated to adopt. Therefore, if the initial sample is small and the user data is incomplete or missing, an appropriate evaluation model cannot be constructed. For example, if one of the variables of the model for evaluating repayment ability is wage income and the wage income of the user cannot be obtained, the repayment ability cannot be accurately evaluated.
After the credit evaluation model is constructed, how to optimize the model and improve the evaluation accuracy is a problem that needs to be solved urgently by a person skilled in the art.
Disclosure of Invention
In view of the defects in the prior art, the invention provides a data processing method and a data processing device for optimizing a credit evaluation model, which can optimize the credit evaluation model and improve the evaluation precision.
In a first aspect, the present invention provides a data processing method for optimizing a credit assessment model, comprising:
acquiring relevant information of a borrower as sample data;
dividing the sample data into a training set and a test set;
carrying out data modeling by using the training set to obtain a preliminary evaluation model;
testing the preliminary evaluation model by using the test set;
if the test result does not meet the evaluation standard, re-dividing the sample data into a training set and a test set, and carrying out data modeling and testing again with the re-divided training set and test set;
and if the test result meets the evaluation standard, finishing the training and determining a final evaluation model.
The data processing method for optimizing the credit evaluation model provided by the invention divides the sample data into a training set and a test set, constructs the evaluation model from the training set, and tests the predictive ability of the evaluation model on the test set. When the test fails, the training set and the test set are re-divided, so that the variables are reclassified and new model characteristic values are obtained. Optimizing the evaluation model through this cross-validation approach improves the evaluation precision. In addition, cross-validation makes effective use of all the information in the sample data, mines the characteristics of borrowers in depth, improves the evaluation precision of the model, and addresses the over-fitting problem.
Preferably, carrying out data modeling by using the training set to obtain a preliminary evaluation model includes:
performing segmentation processing on the continuous variable in the training set by adopting a decision tree algorithm, and converting the continuous variable into a discrete variable;
classifying the discrete variables in the training set by adopting a clustering algorithm;
combining the variables according to the classification result, and determining a preliminary model characteristic value;
and performing logistic regression on the sample data of the model characteristic value to establish a preliminary evaluation model.
Preferably, before performing the logistic regression, the method further comprises:
if the model characteristic value of the borrower lacks data, the data of the model characteristic value is supplemented.
Preferably, if the model characteristic value of the borrower lacks data, the data of the model characteristic value is supplemented, and the method comprises the following steps:
if the model characteristic value of the borrower lacks data, searching a replacement variable of the model characteristic value;
and completing the data of the model characteristic value according to the searched data of the replacement variable.
Preferably, the method of determining the replacement variable comprises:
calculating Euclidean distances between variables;
the two variables with Euclidean distance smaller than the threshold value are mutual replacement variables.
Preferably, if the model characteristic value of the borrower lacks data, the data of the model characteristic value is supplemented, and the method comprises the following steps:
if the model characteristic values of the borrowers lack data, calculating the mean value or the median value of the model characteristic values of all the borrowers;
and completing the model characteristic value of the missing data of the borrower according to the calculated mean value or the calculated median value.
Preferably, the method further comprises the following steps: acquiring external statistical data;
if the model characteristic value of the borrower lacks data, the data of the model characteristic value is supplemented, and the method comprises the following steps:
and if the model characteristic value of the borrower lacks data, supplementing the model characteristic value of the borrower lacking data according to the external statistical data.
Preferably, before performing the logistic regression, the method further comprises:
calculating the information value of each variable;
judging whether each variable is valid by checking its information value against a preset value threshold;
excluding invalid variables from the logistic regression.
In a second aspect, the present invention provides a data processing apparatus for optimizing a credit assessment model, comprising:
the data acquisition module is used for acquiring the related information of the borrower as sample data;
the sample dividing module is used for dividing the sample data into a training set and a test set;
the model training module is used for carrying out data modeling by utilizing the training set to obtain a preliminary evaluation model;
the model testing module is used for testing the preliminary evaluation model by utilizing the test set; if the test result does not meet the evaluation standard, the sample data is re-divided into a training set and a test set, and data modeling and testing are carried out again with the re-divided training set and test set; and if the test result meets the evaluation standard, the training is finished and a final evaluation model is determined.
In a third aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs any of the methods described above in the first aspect.
Drawings
FIG. 1 is a flow chart of a data processing method for optimizing a credit evaluation model according to an embodiment of the present invention;
FIG. 2 is a block diagram of a data processing apparatus for optimizing a credit evaluation model according to an embodiment of the present invention;
FIG. 3 is a block diagram of a model training module according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and therefore are only examples, and the protection scope of the present invention is not limited thereby.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which the invention pertains.
As shown in fig. 1, the present embodiment provides a data processing method for optimizing a credit evaluation model, including:
Step S1, acquiring relevant information of the borrower as sample data.
The sample data includes continuous variables and discrete variables. The borrower-related information is any information that may reveal a specific behavioral characteristic of the borrower, and may include, but is not limited to: age, wage income, marital status, home ownership, employment status, insurance purchases, education, and so on. These may affect the borrower's ability to repay the loan and therefore serve as variables for the loan evaluation. According to its type, the sample data can be divided into continuous variables and discrete variables. For example, data that take specific numerical values over a continuous range, such as age and wage income, are continuous variables, while data that are non-numerical or take discrete values, such as education, are discrete variables.
The sample data of each borrower also comprises default conditions of the borrower, namely, the borrower with default is a bad client, and the borrower without default is a good client.
Step S2, dividing the sample data into training set and testing set.
Preferably, the sample data may be divided into the training set and the test set in a 7:3 ratio.
Step S3, performing data modeling by using the training set to obtain a preliminary evaluation model.
Step S4, testing the preliminary evaluation model by using the test set.
The sample data in the test set is input into the preliminary evaluation model to predict whether each borrower is a good client or a bad client.
Step S5, if the test result does not meet the evaluation standard, re-dividing the sample data into a training set and a test set, and performing data modeling and testing again with the re-divided training set and test set.
Step S6, if the test result meets the evaluation standard, ending the training and determining the final evaluation model.
The test result is evaluated as follows: the credit prediction output for each borrower in step S4 is compared with the default condition of that borrower recorded in the sample data to judge whether the prediction is correct; the accuracy over the test set is then computed and checked against the evaluation standard.
The data processing method for optimizing the credit evaluation model provided by this embodiment divides the sample data into a training set and a test set, constructs an evaluation model from the training set, and tests the predictive ability of the evaluation model on the test set. When the test fails, the training set and the test set are re-divided, so that the variables are reclassified and new model characteristic values are obtained. Optimizing the evaluation model through this cross-validation approach improves the evaluation precision. In addition, cross-validation makes effective use of all the information in the sample data, mines the characteristics of borrowers in depth, improves the evaluation precision of the model, and addresses the over-fitting problem.
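The loop formed by steps S2 to S6 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the 0.8 accuracy standard, the cap on the number of rounds, and the toy threshold model standing in for step S3 are all assumptions made for illustration.

```python
import random

def split_samples(samples, train_ratio=0.7, seed=None):
    """Step S2: shuffle the samples and split them 7:3 into training and test sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, test_set):
    """Step S4: fraction of test samples whose prediction matches the recorded default label."""
    correct = sum(1 for features, label in test_set if model(features) == label)
    return correct / len(test_set)

def train_until_standard(samples, fit, standard=0.8, max_rounds=20):
    """Steps S2-S6: re-divide, refit and retest until the accuracy meets the standard."""
    best_model, best_acc = None, -1.0
    for round_no in range(max_rounds):  # max_rounds is an assumed safety cap
        train_set, test_set = split_samples(samples, seed=round_no)
        model = fit(train_set)
        acc = accuracy(model, test_set)
        if acc > best_acc:
            best_model, best_acc = model, acc
        if acc >= standard:
            break
    return best_model, best_acc

def fit_threshold(train_set):
    """Toy stand-in for step S3: split at the midpoint of the two class means."""
    good = [f[0] for f, y in train_set if y == 0]
    bad = [f[0] for f, y in train_set if y == 1]
    cut = (sum(good) / len(good) + sum(bad) / len(bad)) / 2
    return lambda f: 1 if f[0] < cut else 0

# Hypothetical samples: ((wage income,), label) with label 1 = default ("bad client").
samples = [((1000 + 500 * i,), 1 if i < 10 else 0) for i in range(20)]
model, acc = train_until_standard(samples, fit_threshold, standard=0.8)
print(f"test accuracy of the selected model: {acc:.2f}")
```

Passing the modeling step in as the `fit` argument keeps the re-division loop independent of how the preliminary model is built, so the toy threshold model can be swapped for the decision tree, clustering and logistic regression pipeline described in this embodiment.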
A preferred embodiment of step S3 includes:
Step S401, performing segmentation processing on the continuous variables in the training set by adopting a decision tree algorithm, and converting the continuous variables into discrete variables.
When the predicted likelihood of default differs greatly across sub-ranges of a borrower characteristic, dividing the variable into several segments and analyzing each segment separately describes the borrower better than treating it as a single variable, and thus optimizes the categories of borrower characteristics. Segmenting a continuous variable with a decision tree algorithm discretizes it and divides the borrowers into different homogeneous subgroups, which improves the performance of the logistic regression. Any existing decision tree algorithm may be used, and the details are not repeated here. This embodiment preferably adopts chi-square automatic interaction detection (CHAID), a non-parametric decision tree method that has been applied effectively in various research fields, such as customer consumption trends in marketing, human behavior in psychology, and landslides in geology. CHAID segments continuous variables well to optimize the categories of borrower characteristics, and applying it together with logistic regression helps overcome the difficulty that logistic regression has with nonlinear relationships.
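As a rough illustration of chi-square based segmentation, the sketch below recursively picks the cut point with the largest Pearson chi-square statistic between "segment" and "good/bad". It is a simplification and not full CHAID (no category merging and no Bonferroni-adjusted p-values); the function names, the 3.84 critical value, the minimum segment size and the toy data are assumptions.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def best_cut(values, labels):
    """Return (cut, chi2) for the threshold that best separates good (0) from bad (1)."""
    pairs = sorted(zip(values, labels))
    total_bad = sum(labels)
    total_good = len(labels) - total_bad
    best = (None, 0.0)
    left_good = left_bad = 0
    for i in range(len(pairs) - 1):
        value, label = pairs[i]
        left_bad += label
        left_good += 1 - label
        if pairs[i + 1][0] == value:
            continue  # no cut point between two equal values
        chi2 = chi_square_2x2(left_good, left_bad,
                              total_good - left_good, total_bad - left_bad)
        if chi2 > best[1]:
            best = ((value + pairs[i + 1][0]) / 2, chi2)
    return best

def find_cuts(values, labels, min_chi2=3.84, min_size=4, depth=2):
    """Recursively collect cut points; 3.84 is the 5% critical value for 1 degree of freedom."""
    if depth == 0 or len(values) < 2 * min_size:
        return []
    cut, chi2 = best_cut(values, labels)
    if cut is None or chi2 < min_chi2:
        return []
    left = [(v, y) for v, y in zip(values, labels) if v <= cut]
    right = [(v, y) for v, y in zip(values, labels) if v > cut]
    if len(left) < min_size or len(right) < min_size:
        return []
    cuts = find_cuts([v for v, _ in left], [y for _, y in left], min_chi2, min_size, depth - 1)
    cuts.append(cut)
    cuts += find_cuts([v for v, _ in right], [y for _, y in right], min_chi2, min_size, depth - 1)
    return cuts

def to_segment(value, cuts):
    """Map a continuous value to the index of the segment it falls into."""
    return sum(1 for c in cuts if value > c)

# Hypothetical data: borrowers under 30 defaulted (1), all others did not (0).
ages = list(range(20, 60, 2))
defaults = [1 if age < 30 else 0 for age in ages]
cuts = find_cuts(ages, defaults)
print(cuts, to_segment(25, cuts), to_segment(40, cuts))
```

The segment index returned by `to_segment` is the discrete variable that replaces the original continuous one in the later steps.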
Step S402, classifying the discrete variables in the training set by adopting a clustering algorithm.
The discrete variables in step S402 include the original discrete variables in the sample data and the discrete variables obtained through the conversion in step S401.
Clustering is an unsupervised learning technique that groups data with similar features into clusters; it can associate similar features in the sample data and reduce the effect of misclassification between variables. The clustering in this embodiment refers to variable clustering (also called R-type clustering), which classifies the variables according to the sample data of each borrower and finds a representative element (i.e., a model characteristic value) in each class. By separating heterogeneous borrowers, the clustered variables can improve prediction efficiency. Therefore, this embodiment uses clustering to classify and merge the variables, which improves the feature partitioning of the variables for logistic regression and thereby improves the performance of credit default prediction. Any existing clustering algorithm may be used, and the details are not repeated here. In this embodiment, clustering is performed with Ward's minimum variance hierarchical method: correlations among small-sample variables are found according to the minimum variance criterion, and these variables are grouped into one class, which addresses the problem that small-sample variables can hardly take part in the statistical calculation of the regression. For example, small-sample categories of educational background, such as "major" and "scholars", are grouped together into a new category "bachelor's degree or above".
Step S403, merging the variables according to the classification result, and determining the preliminary model characteristic values.
The variables can be merged according to the classification result as follows: the correlations among the variables in the same class are calculated, and the variable with the greatest correlation with the other variables is taken as the model characteristic value of that class and replaces the other variables in the class, which simplifies the input variables of the evaluation model.
A model characteristic value is an important characteristic of the borrower that has been identified as likely to lead to loan default.
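The merging rule of step S403 can be sketched as follows. The text does not fix the correlation measure, so this sketch assumes the mean absolute Pearson correlation with the other variables of the class; the function names and the toy class of three variables are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant variable carries no correlation information
    return cov / (norm_x * norm_y)

def representative(cluster):
    """Pick the variable with the highest mean |correlation| to the others in its class."""
    names = list(cluster)
    if len(names) == 1:
        return names[0]

    def score(name):
        others = [other for other in names if other != name]
        return sum(abs(pearson(cluster[name], cluster[o])) for o in others) / len(others)

    return max(names, key=score)

# Hypothetical class of three variables produced by the clustering step.
cluster = {
    "monthly_income": [1, 2, 3, 4, 5],
    "annual_bonus": [1, 2, 3, 5, 4],
    "card_limit": [5, 1, 4, 2, 3],
}
chosen = representative(cluster)
print(chosen)
```

Only the chosen variable of each class is passed on as a model characteristic value; the other variables of the class are dropped from the model input.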
Step S404, performing logistic regression on the sample data of the model characteristic values to establish the preliminary evaluation model.
Logistic regression has strong predictive ability and is simple to operate, so the prediction target can be achieved conveniently. The independent variables of the logistic regression are the model characteristic values, and its binary dependent variable is the default condition of the borrower, namely "good client" or "bad client". The evaluation model is obtained by finding the relationship between the independent variables and the dependent variable through logistic regression; this is the ordinary training process of logistic regression and is not repeated here.
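For readers who want to see the "ordinary training process" spelled out, here is a minimal logistic regression trained by batch gradient descent in plain Python. The learning rate, epoch count, function names and the single binary indicator feature are assumptions chosen for illustration; a production system would more likely rely on an established statistics library.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def default_probability(weights, bias, x):
    """Predicted probability that the borrower is a "bad client"."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

# Hypothetical model characteristic value: 1 = low-income segment, 0 = otherwise.
rows = [[1], [1], [1], [1], [0], [0], [0], [0]]
labels = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = default ("bad client")
weights, bias = fit_logistic(rows, labels)
p_low = default_probability(weights, bias, [1])
p_other = default_probability(weights, bias, [0])
print(round(p_low, 2), round(p_other, 2))
```

With a single indicator feature the fitted probabilities approach the observed default rates of the two groups (3 of 4 and 1 of 4), which is a quick sanity check on the training loop.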
According to the method, the continuous variables can be well segmented through decision tree classification so as to optimize the categories of characteristics of the borrowers, and the nonlinear defect can be overcome when the method is applied to logistic regression; the problem that small sample data can hardly participate in statistical calculation in logistic regression is solved through clustering, the small sample data is fully utilized, and the estimation precision of the model is improved; by combining the various algorithms, a proper model characteristic value can be mined, and the evaluation precision of the credit evaluation model is improved.
Because the sources of the sample data are complex, the integrity of the sample data is difficult to guarantee. In order to still make effective use of the sample data for analysis when some of it is missing, the method of this embodiment further comprises, before performing logistic regression, step S405: if the model characteristic value of the borrower lacks data, completing the data of the model characteristic value.
The preferred embodiment of step S405 specifically includes:
Step S511, if the model characteristic value of the borrower lacks data, searching for a replacement variable of the model characteristic value.
A variable and its replacement variable are correlated to some degree, so when the data of one variable is unavailable, the data of its replacement variable can be used instead. This supplements the sample data and improves its utilization.
Step S512, completing the data of the model characteristic value according to the data of the replacement variable that has been found.
The method for determining the replacement variable comprises the following steps:
calculating Euclidean distances between variables;
the two variables with Euclidean distance smaller than the threshold value are mutual replacement variables.
The threshold can be determined according to actual conditions, and should be neither too large nor too small: if it is too small, no replacement variable can be found, and if it is too large, the replacement variable found is not suitable. Alternatively, the two variables with the minimum Euclidean distance may be used as each other's replacement variables. When the data of one variable is missing, it can be replaced by the data of its replacement variable.
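A minimal sketch of the replacement-variable search follows. Standardizing each variable before measuring the distance is an assumption added here so that variables on different scales are comparable; the text itself only specifies the Euclidean distance and a threshold. The function names, the 0.5 threshold and the data are hypothetical.

```python
import math

def standardize(column):
    """Rescale a column to zero mean and unit variance (assumed preprocessing)."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std if std else 0.0 for v in column]

def euclidean(x, y):
    """Euclidean distance between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def replacement_pairs(columns, threshold):
    """Return variable pairs whose distance is below the threshold."""
    names = sorted(columns)
    scaled = {name: standardize(columns[name]) for name in names}
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if euclidean(scaled[first], scaled[second]) < threshold:
                pairs.append((first, second))
    return pairs

def fill_missing(record, pairs):
    """Fill a missing value (None) with the value of its replacement variable."""
    filled = dict(record)
    for first, second in pairs:
        if filled.get(first) is None and filled.get(second) is not None:
            filled[first] = filled[second]
        elif filled.get(second) is None and filled.get(first) is not None:
            filled[second] = filled[first]
    return filled

# Hypothetical complete columns used to discover replacement variables.
columns = {
    "monthly_salary": [3000, 5000, 8000, 12000],
    "bank_inflow": [3200, 5100, 7900, 12500],
    "age": [45, 23, 51, 30],
}
pairs = replacement_pairs(columns, threshold=0.5)
borrower = {"age": 30, "bank_inflow": 6000, "monthly_salary": None}
print(pairs, fill_missing(borrower, pairs))
```

Copying the raw value of the replacement variable follows the wording of the text; when the two variables are measured on different scales, a rescaling step would be needed before substitution.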
Another preferred embodiment of step S405 specifically includes:
Step S521, if the model characteristic value of the borrower lacks data, calculating the mean value or the median value of the model characteristic value over all borrowers.
Step S522, completing the model characteristic value of the borrower lacking data according to the calculated mean value or median value.
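Steps S521 and S522 amount to ordinary mean or median imputation, sketched below with Python's standard `statistics` module. The function name, the use of `None` to mark missing data, and the income figures are assumptions for illustration.

```python
from statistics import mean, median

def impute(values, method="median"):
    """Replace missing entries (None) with the mean or median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed) if method == "median" else mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical wage incomes of six borrowers, two of them missing.
incomes = [4000, None, 6000, 5000, None, 20000]
print(impute(incomes))           # median of the observed values fills the gaps
print(impute(incomes, "mean"))   # mean of the observed values fills the gaps
```

The median is less sensitive to outliers such as the 20000 entry, which is one reason to offer it alongside the mean.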
Another preferred embodiment of step S405 specifically includes: and if the model characteristic value of the borrower lacks data, supplementing the model characteristic value of the borrower lacking data according to the external statistical data.
The external statistical data is acquired at the stage of acquiring the sample data. External statistical data refers to data of a statistical nature, such as the employment rate and the average wage in Shenzhen.
Not all variables affect the final evaluation result. In order to reduce the amount of data to be processed, the variables that are invalid for the evaluation result need to be filtered out before performing logistic regression, which specifically includes:
calculating the information value of each variable;
judging whether each variable is valid by checking its information value against a preset value threshold;
excluding invalid variables from the logistic regression.
The validity check may be carried out on the variables before they are classified, so as to reduce the number of variables participating in clustering; alternatively, it may be carried out only on the variables determined as model characteristic values, so as to further reduce the independent variables participating in model building.
In practical applications, the evidence weight (WOE) is the logarithm of the ratio between the proportion of "good" borrowers and the proportion of "bad" borrowers having a given characteristic, and is used to assess and compare the relative risk of different classes of variables. The evidence weight is calculated as follows:
WOE=ln(DistrGoods/DistrBads),
where WOE represents the evidence weight of a given characteristic variable, DistrGoods represents the distribution proportion of "good" borrowers in the sample data for the characteristic variable, and DistrBads represents the distribution proportion of "bad" borrowers in the sample data for the characteristic variable. The larger the positive value of WOE, the lower the customer's risk of credit default; the more negative the value of WOE, the higher the customer's risk of credit default. WOE converts variables onto a common scale, so that different types of variables can be treated in the same way. Converting variables into WOE also helps preserve degrees of freedom in small-sample problems. Therefore, WOE is used to compare different variables in a small sample data set. The information value evaluates the predictive ability of a characteristic variable, and is calculated as follows:
IV=(DistrGoods-DistrBads)*WOE,
where IV represents the information value of a given characteristic variable, DistrGoods represents the distribution proportion of "good" borrowers in the sample data for the characteristic variable, DistrBads represents the distribution proportion of "bad" borrowers in the sample data for the characteristic variable, and WOE represents the evidence weight of the characteristic variable.
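The WOE and IV calculations can be sketched in Python as below. Three points are assumptions rather than statements of the text: WOE is taken as the natural logarithm of DistrGoods over DistrBads, the per-category IV terms are summed to give one information value per variable (the conventional definition), and empty cells are adjusted to 0.5 to avoid taking the logarithm of zero. The 0.02 validity threshold, the function name and the data are likewise hypothetical.

```python
import math

def woe_iv(categories, labels, zero_adjust=0.5):
    """Return ({category: WOE}, IV) for one discrete variable; labels: 0 = good, 1 = bad."""
    total_good = labels.count(0)
    total_bad = labels.count(1)
    woe, iv = {}, 0.0
    for cat in sorted(set(categories)):
        good = sum(1 for c, y in zip(categories, labels) if c == cat and y == 0)
        bad = sum(1 for c, y in zip(categories, labels) if c == cat and y == 1)
        # Assumed adjustment: treat an empty cell as 0.5 so the logarithm is defined.
        distr_goods = max(good, zero_adjust) / total_good
        distr_bads = max(bad, zero_adjust) / total_bad
        woe[cat] = math.log(distr_goods / distr_bads)
        iv += (distr_goods - distr_bads) * woe[cat]
    return woe, iv

# Hypothetical housing status of ten borrowers and their default labels.
housing = ["owner"] * 5 + ["renter"] * 5
defaults = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]
woe, iv = woe_iv(housing, defaults)

IV_THRESHOLD = 0.02  # assumed preset value threshold
is_valid = iv >= IV_THRESHOLD
print(woe, round(iv, 4), is_valid)
```

Here owners have a positive WOE (more "good" than "bad" borrowers, proportionally) and renters a negative one, and the variable as a whole passes the assumed threshold, so it would be kept for the logistic regression.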
As shown in FIG. 2, based on the same inventive concept as the above-described data processing method for optimizing a credit evaluation model, the present embodiment provides a data processing apparatus for optimizing a credit evaluation model, including:
the data acquisition module is used for acquiring the related information of the borrower as sample data;
the sample dividing module is used for dividing the sample data into a training set and a test set;
the model training module is used for carrying out data modeling by utilizing the training set to obtain a preliminary evaluation model;
the model testing module is used for testing the preliminary evaluation model by utilizing the test set; if the test result does not meet the evaluation standard, the sample data is re-divided into a training set and a test set, and data modeling and testing are carried out again with the re-divided training set and test set; and if the test result meets the evaluation standard, the training is finished and a final evaluation model is determined.
Preferably, as shown in fig. 3, the model training module specifically includes:
the first classification module is used for performing segmentation processing on the continuous variables in the training set by adopting a decision tree algorithm and converting the continuous variables into discrete variables;
the second classification module is used for classifying the discrete variables in the training set by adopting a clustering algorithm;
the variable merging module is used for merging the variables according to the classification result and determining a preliminary model characteristic value;
and the logistic regression module is used for carrying out logistic regression on the sample data of the model characteristic value to establish a preliminary evaluation model.
Preferably, the apparatus further comprises a replacement variable module for:
calculating Euclidean distances between variables;
the two variables with Euclidean distance smaller than the threshold value are mutual replacement variables.
Preferably, the data completing module is further included for: and before carrying out logistic regression, if the model characteristic value of the borrower lacks data, complementing the data of the model characteristic value.
Preferably, the data completion module is specifically configured to:
if the model characteristic value of the borrower lacks data, searching a replacement variable of the model characteristic value;
and completing the data of the model characteristic value according to the searched data of the replacement variable.
Preferably, the data completion module is configured to:
if the model characteristic values of the borrowers lack data, calculating the mean value or the median value of the model characteristic values of all the borrowers;
and completing the model characteristic value of the missing data of the borrower according to the calculated mean value or the calculated median value.
Preferably, the data acquisition module may be further configured to acquire external statistical data; correspondingly, the data completion module is specifically configured to: and if the model characteristic value of the borrower lacks data, supplementing the model characteristic value of the borrower lacking data according to the external statistical data.
Preferably, the apparatus further comprises a variable cleaning module for: calculating the information value of each variable before performing logistic regression; judging whether each variable is valid by checking its information value against a preset value threshold; and excluding invalid characteristic variables from the logistic regression.
The data processing apparatus provided by this embodiment is based on the same inventive concept as the above-described data processing method and has the same beneficial effects, which are not repeated here.
Based on the same inventive concept as the above-described data processing method for optimizing a credit evaluation model, this embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method described in any of the method embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present invention, and they should be construed as being included in the following claims and description.

Claims (7)

1. A data processing method for optimizing a credit evaluation model, comprising:
acquiring relevant information of a borrower as sample data;
dividing the sample data into a training set and a test set;
carrying out data modeling by using the training set to obtain a preliminary evaluation model;
testing the preliminary evaluation model by using the test set;
if the test result does not meet the evaluation standard, re-dividing the sample data into a training set and a test set, and carrying out data modeling and testing again with the re-divided training set and test set;
if the test result meets the evaluation standard, finishing the training and determining a final evaluation model;
the data modeling is carried out by utilizing the training set to obtain a preliminary evaluation model, and the preliminary evaluation model comprises the following steps:
performing segmentation processing on the continuous variable in the training set by adopting a decision tree algorithm, and converting the continuous variable into a discrete variable;
classifying the discrete variables in the training set by adopting a clustering algorithm;
combining the variables according to the classification result, and determining a preliminary model characteristic value;
performing logistic regression on the sample data of the model characteristic value to establish a preliminary evaluation model;
before performing the logistic regression, the method further comprises:
if a model characteristic value of the borrower lacks data, supplementing the data of the model characteristic value;
wherein supplementing the data of the model characteristic value if the model characteristic value of the borrower lacks data comprises:
if the model characteristic value of the borrower lacks data, searching for a replacement variable of the model characteristic value;
and completing the data of the model characteristic value according to the data of the replacement variable found.
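As an illustration only, the divide, train, test, and re-divide loop of claim 1 might be sketched in Python as follows. The function names, the `test_ratio` and `max_rounds` parameters, and the convention that a higher score is better are assumptions made for this sketch; the claim does not fix a split ratio, a scoring metric, or an upper bound on the number of rounds.

```python
import random


def split(samples, test_ratio, rng):
    """Shuffle a copy of the samples and cut it into (training set, test set)."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def train_until_standard(samples, train_fn, score_fn, standard,
                         test_ratio=0.3, max_rounds=20, seed=0):
    """Re-divide and retrain until the test score meets the standard.

    max_rounds is a safeguard added for this sketch (it is not in the claim)
    so that the loop always terminates.
    Returns (model, score, rounds); model is None if the standard was never met.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    rng = random.Random(seed)
    score = None
    for rounds in range(1, max_rounds + 1):
        train, test = split(samples, test_ratio, rng)  # (re-)divide the samples
        model = train_fn(train)                        # preliminary evaluation model
        score = score_fn(model, test)                  # test on the held-out set
        if score >= standard:                          # evaluation standard met
            return model, score, rounds                # final evaluation model
    return None, score, max_rounds
```

Here `train_fn` would stand for the segmentation, clustering, merging, and logistic regression steps recited above, and `score_fn` for whatever evaluation standard is chosen.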
2. The method of claim 1, wherein determining the replacement variable comprises:
calculating Euclidean distances between variables;
two variables whose Euclidean distance is smaller than a threshold value are replacement variables for each other.
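The pairing rule of claim 2 can be sketched as below. This is a minimal illustration under stated assumptions: each variable is represented as a column of numeric values over the same borrowers, the columns have already been scaled to comparable ranges, and the names `replacement_pairs`, `columns`, and `threshold` are hypothetical. The claim itself does not say how the variables are scaled or how the threshold is chosen.

```python
import math
from itertools import combinations


def euclidean(xs, ys):
    """Euclidean distance between two equally long numeric columns."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))


def replacement_pairs(columns, threshold):
    """Return the pairs of variable names whose columns lie closer than threshold.

    columns maps a variable name to its list of values (assumed fully observed
    and pre-scaled on the rows used for the comparison).
    """
    pairs = []
    for (name_a, xs), (name_b, ys) in combinations(columns.items(), 2):
        if euclidean(xs, ys) < threshold:
            pairs.append((name_a, name_b))
    return pairs
```

A variable found this way could then supply the value used to complete a missing model characteristic value, as recited in claim 1.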
3. The method of claim 1, wherein supplementing the data of the model characteristic value if the model characteristic value of the borrower lacks data comprises:
if the model characteristic value of the borrower lacks data, calculating the mean or the median of that model characteristic value over all borrowers;
and completing the borrower's missing model characteristic value with the calculated mean or median.
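A minimal sketch of the mean or median completion of claim 3, assuming that a missing value is represented by `None` and that the statistic is computed over the borrowers whose value is present. The function name `impute` and its `strategy` parameter are illustrative, not taken from the patent.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Fill None entries with the mean or median of the observed entries."""
    if strategy not in ("mean", "median"):
        raise ValueError("strategy must be 'mean' or 'median'")
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to compute a statistic from
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]
```

The median is generally less sensitive than the mean to a few extreme values, which may matter for skewed variables such as income.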
4. The method of claim 1, further comprising: acquiring external statistical data;
wherein supplementing the data of the model characteristic value if the model characteristic value of the borrower lacks data comprises:
if the model characteristic value of the borrower lacks data, supplementing the borrower's missing model characteristic value according to the external statistical data.
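One way the completion from external statistical data in claim 4 could look is sketched below. The claim does not say what the external data contains, so this example assumes a hypothetical lookup table keyed by a grouping attribute of the borrower (for instance a regional average). The names `fill_from_external`, `group_key`, and `external_stats` are all assumptions.

```python
def fill_from_external(borrowers, feature, group_key, external_stats):
    """Return copies of the borrower records with a missing feature filled in.

    borrowers: list of dicts, one per borrower; a missing value is None or absent.
    external_stats: dict mapping a group (e.g. a region) to an external figure.
    A value stays None when no external figure exists for the borrower's group.
    """
    filled = []
    for record in borrowers:
        row = dict(record)  # leave the input records untouched
        if row.get(feature) is None:
            row[feature] = external_stats.get(row.get(group_key))
        filled.append(row)
    return filled
```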
5. The method of claim 1, further comprising, prior to performing the logistic regression:
calculating the information value of each variable;
checking each variable against a preset information value threshold to judge whether the variable is valid;
excluding invalid variables from the logistic regression.
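The screening of claim 5 can be illustrated with the usual definition of information value for a discretized variable, IV = Σ (g_i − b_i) · ln(g_i / b_i), where g_i and b_i are the shares of good and bad samples falling into bin i. The sketch below assumes labels of 0 for good and 1 for bad borrowers, substitutes a count of `eps` for empty cells to avoid a logarithm of zero, and uses 0.02 as a default threshold. The smoothing, the label coding, and the threshold value are assumptions of this example; the claim only requires that a threshold be preset.

```python
import math
from collections import Counter


def information_value(bins, labels, eps=0.5):
    """Information value of one discrete variable (labels: 0 = good, 1 = bad)."""
    good = Counter(b for b, y in zip(bins, labels) if y == 0)
    bad = Counter(b for b, y in zip(bins, labels) if y == 1)
    total_good, total_bad = sum(good.values()), sum(bad.values())
    if total_good == 0 or total_bad == 0:
        raise ValueError("need at least one good and one bad sample")
    iv = 0.0
    for b in set(bins):
        # eps replaces a zero count so the logarithm stays defined
        p_good = max(good[b], eps) / total_good
        p_bad = max(bad[b], eps) / total_bad
        iv += (p_good - p_bad) * math.log(p_good / p_bad)
    return iv


def valid_variables(binned, labels, threshold=0.02):
    """Names of the variables whose information value reaches the threshold."""
    return [name for name, bins in binned.items()
            if information_value(bins, labels) >= threshold]
```

Only the variables returned by `valid_variables` would then be passed on to the logistic regression.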
6. A data processing apparatus for optimizing a credit evaluation model, comprising:
the data acquisition module is used for acquiring the related information of the borrower as sample data;
the sample dividing module is used for dividing the sample data into a training set and a test set;
the model training module is used for carrying out data modeling by utilizing the training set to obtain a preliminary evaluation model;
the model testing module is used for testing the preliminary evaluation model by utilizing the test set; if the test result does not meet the evaluation standard, the sample data is re-divided into a training set and a test set, and data modeling and testing are carried out again by using the re-divided training set and test set; if the test result meets the evaluation standard, the training is finished and a final evaluation model is determined;
the model training module specifically comprises:
the first classification module is used for performing segmentation processing on the continuous variables in the training set by adopting a decision tree algorithm and converting the continuous variables into discrete variables;
the second classification module is used for classifying the discrete variables in the training set by adopting a clustering algorithm;
the variable merging module is used for merging the variables according to the classification result and determining a preliminary model characteristic value;
and the logistic regression module is used for carrying out logistic regression on the sample data of the model characteristic value to establish a preliminary evaluation model.
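The segmentation performed by the first classification module (the decision tree step also recited in claim 1) might be sketched as follows. This is an illustrative single-variable tree that picks cut points by minimizing weighted Gini impurity; the patent does not name a splitting criterion or a tree depth, so the criterion, `max_depth`, and all function names here are assumptions.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_cut(pairs):
    """Best cut for a sorted list of (value, label) pairs, or None if no cut helps."""
    n = len(pairs)
    best, best_score = None, gini([y for _, y in pairs])
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score - 1e-12:
            best, best_score = (pairs[i - 1][0] + pairs[i][0]) / 2, score
    return best


def tree_cuts(values, labels, max_depth=2):
    """Sorted cut points that turn a continuous variable into discrete segments."""
    def recurse(pairs, depth):
        if depth == 0 or len(pairs) < 2:
            return []
        cut = best_cut(pairs)
        if cut is None:
            return []
        left = [p for p in pairs if p[0] <= cut]
        right = [p for p in pairs if p[0] > cut]
        return recurse(left, depth - 1) + [cut] + recurse(right, depth - 1)
    return recurse(sorted(zip(values, labels)), max_depth)


def to_bin(value, cuts):
    """Index of the segment a value falls into (0 for the lowest segment)."""
    return sum(value > c for c in cuts)
```

The segment indices produced by `to_bin` are the discrete variable that the later clustering and merging steps operate on.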
7. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any one of claims 1 to 5.
CN201710785991.2A 2017-09-04 2017-09-04 Data processing method and device for optimizing credit evaluation model Active CN107633265B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710785991.2A CN107633265B (en) 2017-09-04 2017-09-04 Data processing method and device for optimizing credit evaluation model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710785991.2A CN107633265B (en) 2017-09-04 2017-09-04 Data processing method and device for optimizing credit evaluation model

Publications (2)

Publication Number Publication Date
CN107633265A CN107633265A (en) 2018-01-26
CN107633265B true CN107633265B (en) 2021-03-30

Family

ID=61101009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710785991.2A Active CN107633265B (en) 2017-09-04 2017-09-04 Data processing method and device for optimizing credit evaluation model

Country Status (1)

Country Link
CN (1) CN107633265B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110322334A (en) * 2018-03-29 2019-10-11 上海麦子资产管理集团有限公司 Credit rating method and device, computer readable storage medium, terminal
CN110472802B (en) * 2018-05-09 2023-12-01 创新先进技术有限公司 Data characteristic evaluation method, device and equipment
CN108960505A (en) * 2018-05-31 2018-12-07 试金石信用服务有限公司 Quantitative estimation method, device, system and the storage medium of personal finance credit
WO2019227415A1 (en) * 2018-05-31 2019-12-05 重庆小雨点小额贷款有限公司 Scorecard model adjustment method, device, server and storage medium
CN108711103A (en) * 2018-06-04 2018-10-26 中国平安人寿保险股份有限公司 Personal loan repays Risk Forecast Method, device, computer equipment and medium
CN108985341A (en) * 2018-06-26 2018-12-11 四川斐讯信息技术有限公司 A kind of the training set appraisal procedure and system of neural network model
CN108898479B (en) * 2018-06-28 2021-12-03 中国农业银行股份有限公司 Credit evaluation model construction method and device
CN109325020A (en) * 2018-08-20 2019-02-12 中国平安人寿保险股份有限公司 Small sample application method, device, computer equipment and storage medium
CN109389030B (en) * 2018-08-23 2022-11-29 平安科技(深圳)有限公司 Face characteristic point detection method and device, computer equipment and storage medium
CN110909970A (en) * 2018-09-17 2020-03-24 北京京东金融科技控股有限公司 Credit scoring method and device
CN109408583B (en) * 2018-09-25 2023-04-07 平安科技(深圳)有限公司 Data processing method and device, computer readable storage medium and electronic equipment
CN109409672A (en) * 2018-09-25 2019-03-01 深圳市元征科技股份有限公司 A kind of auto repair technician classifies grading modeling method and device
CN109480864A (en) * 2018-10-26 2019-03-19 首都医科大学附属北京安定医院 A kind of schizophrenia automatic evaluation system based on nervous functional defects and machine learning
CN109583590B (en) * 2018-11-29 2020-11-13 深圳和而泰数据资源与云技术有限公司 Data processing method and data processing device
CN110162995B (en) * 2019-04-22 2023-01-10 创新先进技术有限公司 Method and device for evaluating data contribution degree
CN110363077A (en) * 2019-06-05 2019-10-22 平安科技(深圳)有限公司 Sign Language Recognition Method, device, computer installation and storage medium
CN110458383B (en) * 2019-06-24 2020-08-18 平安国际智慧城市科技股份有限公司 Method and device for realizing demand processing servitization, computer equipment and storage medium
CN110348722A (en) * 2019-07-01 2019-10-18 百维金科(上海)信息科技有限公司 A kind of internet finance air control model based on XGBoost
CN110910002B (en) * 2019-11-15 2023-07-28 安徽海汇金融投资集团有限公司 Account receivables default risk identification method and system
CN111047542B (en) * 2019-12-31 2021-04-27 成都奥伦达科技有限公司 Power line point supplementing method
CN111724374B (en) * 2020-06-22 2024-03-01 智眸医疗(深圳)有限公司 Evaluation method and terminal of analysis result
CN111949640A (en) * 2020-08-04 2020-11-17 上海微亿智造科技有限公司 Intelligent parameter adjusting method and system based on industrial big data
CN112085595A (en) * 2020-09-27 2020-12-15 中国建设银行股份有限公司 Credit scoring model monitoring method and device
CN112258312A (en) * 2020-10-16 2021-01-22 银联商务股份有限公司 Personal credit scoring method and system, electronic device and storage medium
CN112613157A (en) * 2020-11-26 2021-04-06 北京航天智造科技发展有限公司 Rotor fault analysis method and device
CN112580252A (en) * 2020-11-26 2021-03-30 北京航天智造科技发展有限公司 Rotor drop-out fault diagnosis and analysis method and device
CN112365186A (en) * 2020-11-27 2021-02-12 中国电建集团海外投资有限公司 Health degree evaluation method and system for electric power information system
CN112365104A (en) * 2020-12-07 2021-02-12 杭州师范大学 Marital matching method for predicting maximum marital satisfaction
CN112700280A (en) * 2020-12-31 2021-04-23 上海竞动科技有限公司 Short-term discontinuous user behavior evaluation method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101145171A (en) * 2007-09-15 2008-03-19 中国科学院合肥物质科学研究院 Gene microarray data predication method based on independent component integrated study
CN101944122A (en) * 2010-09-17 2011-01-12 浙江工商大学 Incremental learning-fused support vector machine multi-class classification method
CN104574220A (en) * 2015-01-30 2015-04-29 国家电网公司 Power customer credit assessment method based on least square support vector machine
CN104820716A (en) * 2015-05-21 2015-08-05 中国人民解放军海军工程大学 Equipment reliability evaluation method based on data mining
CN105354210A (en) * 2015-09-23 2016-02-24 深圳市爱贝信息技术有限公司 Mobile game payment account behavior data processing method and apparatus
CN106204246A (en) * 2016-08-18 2016-12-07 易联众信息技术股份有限公司 A kind of BP neutral net credit estimation method based on PCA
CN106296389A (en) * 2016-07-28 2017-01-04 联动优势科技有限公司 The appraisal procedure of a kind of user credit degree and device
CN106919706A (en) * 2017-03-10 2017-07-04 广州视源电子科技股份有限公司 The method and device that data update

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070198401A1 (en) * 2006-01-18 2007-08-23 Reto Kunz System and method for automatic evaluation of credit requests

Also Published As

Publication number Publication date
CN107633265A (en) 2018-01-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 518000 2203/2204, Building 1, Huide Building, Beizhan Community, Minzhi Street, Longhua District, Shenzhen, Guangdong

Patentee after: SHENZHEN AUDAQUE DATA TECHNOLOGY Ltd.

Address before: 518000 units J and K, 12 / F, block B, building 7, Baoneng Science Park, Qinghu Industrial Zone, Qingxiang Road, Longhua New District, Shenzhen City, Guangdong Province

Patentee before: SHENZHEN AUDAQUE DATA TECHNOLOGY Ltd.