CN111311400A - Modeling method and system of grading card model based on GBDT algorithm - Google Patents

Publication number
CN111311400A
Authority
CN
China
Prior art keywords
model
data
modeling
gbdt algorithm
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010236404.6A
Other languages
Chinese (zh)
Inventor
江远强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baiweijinke Shanghai Information Technology Co ltd
Original Assignee
Baiweijinke Shanghai Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baiweijinke Shanghai Information Technology Co ltd
Priority to CN202010236404.6A
Publication of CN111311400A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis

Abstract

The invention provides a modeling method and system for a scoring card model based on the GBDT algorithm. The modeling method comprises selecting and defining the modeling sample, acquiring data features, preprocessing data, performing feature engineering, developing the GBDT algorithm, tuning and evaluating the model, and deploying and monitoring the model. The feature-engineered data is divided into a test set and a training set by a set proportion, either randomly or by application date; the training set is used to train the GBDT model, and the test set is used to verify the model indexes of the trained model. The technical scheme combines the traditional credit scoring card construction process with the GBDT algorithm: after feature engineering, the GBDT algorithm replaces logistic regression for training, parameter tuning and evaluation on the modeling data. It can better handle new multi-dimensional internet data with many abnormal and missing values, improves the model's robustness, generality and accuracy, and is better suited to the needs of current big-data risk control.

Description

Modeling method and system of scoring card model based on GBDT algorithm
Technical Field
The invention relates to the technical field of internet financial risk control, and in particular to a modeling method and system for a scoring card model based on the GBDT algorithm.
Background
With the rise of the internet, internet finance businesses such as P2P lending, consumer finance and automobile leasing are flourishing, but the fraud black-market industry chain has also begun to penetrate this emerging field, and only companies with sound risk control technology can develop healthily in this wave. Internet financial risk control mainly relies on the traditional credit scoring card, a risk prediction model that draws on the mature logistic regression approach established by FICO abroad. The credit scoring card predicts a borrower's ability and willingness to repay by evaluating dimensions such as the borrower's identity status, occupational characteristics, income and expenditure, and credit history. The traditional financial risk control scoring card model based on logistic regression is highly interpretable and easy to understand, shows the weight of each feature directly, and can readily absorb new data to update the model, so it is widely used in credit risk control. However, with the development of big data, logistic regression shows clear limitations on new types of internet data. The specific problems and difficulties are as follows:
(1) cumbersome data preprocessing: data takes diverse forms and is often unstructured, sparse and poorly populated, which greatly increases complexity, and processing data features by hand is time-consuming, laborious and inefficient;
(2) difficult feature engineering: after modeling variables are derived, the dimensionality typically reaches thousands or even tens of thousands, far beyond what a traditional logistic-regression-based risk control scoring card system can handle, so further processing with machine learning algorithms is urgently needed;
(3) insufficient model stability: a single model based on logistic regression is a weak classifier and may suffer from insufficient stability and weak generalization ability.
The GBDT model is a strong predictive model obtained by combining multiple weak learners (typically decision trees), and the GBDT algorithm is a typical ensemble learning algorithm. In the GBDT algorithm, two or more decision trees are trained in sequence on labeled samples, and the trained trees are then combined into one model as the training result. During training, each iteration adds a decision tree that reduces the loss function along the gradient direction: each tree learns the residual left by the summed conclusions of all previous trees. As the iterations continue, the residuals of the successively trained trees become smaller and smaller, and training can stop when the training residual is small enough or falls below a set threshold, that is, when the model's fit to the labeled values of the labeled samples meets the standard. Compared with the support vector machine and logistic regression algorithms commonly used in traditional credit evaluation, the GBDT algorithm has better stability and generality and does not require complex feature transformation. It can flexibly handle continuous, discrete and mixed-type features and, by using robust loss functions, can automatically and effectively handle outliers and is insensitive to missing and abnormal feature values, which strengthens the model's robustness. It also achieves higher prediction accuracy and better prediction efficiency with relatively little parameter tuning.
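The residual-fitting loop described above can be sketched in a few lines. This is a minimal illustration only, not the patent's implementation: it assumes a single numeric feature, one-split trees (stumps) and squared loss, for which the negative gradient equals the residual. The names `fit_stump`, `fit_gbdt`, `learning_rate` and `tol` are hypothetical.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # only one distinct x value: fall back to a constant
        constant = mean(residuals)
        return lambda x: constant
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def fit_gbdt(xs, ys, n_trees=50, learning_rate=0.1, tol=1e-6):
    """Train trees in sequence; each tree fits the residual of the current ensemble."""
    base = mean(ys)
    preds = [base] * len(xs)
    trees, history = [], []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        mse = mean(r * r for r in residuals)
        history.append(mse)
        if mse < tol:  # stop once the training residual is small enough
            break
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + learning_rate * sum(tree(x) for tree in trees)

    return predict, history

# Toy data: higher x means more likely to be a "bad" sample (label 1).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
predict, history = fit_gbdt(xs, ys)
```

The `history` list records the mean squared residual before each round, so it shows the residual shrinking as trees are added. A production scoring model would normally use a log-loss objective and a library implementation rather than this sketch.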
Disclosure of Invention
In order to apply the more advanced GBDT machine learning algorithm to traditional models in the real-world internet finance field, to fuse the mature traditional modeling process with an advanced algorithm, and to solve the traditional scoring card model's difficulty with complex multi-dimensional data and its insufficient stability, the invention discloses a modeling method and system for a scoring card model based on the GBDT algorithm. The technical scheme of the invention is implemented as follows:
the modeling method of the scoring card model based on the GBDT algorithm comprises the following steps:
step one: selecting and defining the modeling sample: first defining positive and negative samples according to the product's business, then extracting the modeling sample and excluding special customers;
step two: acquiring data features: acquiring feature data for the modeling sample from step one to obtain the initial model data, wherein the feature data includes People's Bank of China credit records, occupational characteristics, income and expenditure, bank statements, identity status, and third-party data obtained with customer authorization;
step three: data preprocessing: dividing the initial data acquired in step two into continuous data and discrete data, performing statistical analysis on each, cleaning "contaminated" data, checking data consistency, and handling invalid and missing values;
step four: feature engineering: binning the data preprocessed in step three, continuing to optimize the binning according to the weight of evidence (WOE) of each bin until the variable reaches a satisfactory information value (IV), selecting the variables that enter the model according to IV and KS, and replacing the values of the selected variables with their WOE to generate the modeling data;
step five: GBDT algorithm development: dividing the modeling data obtained in step four into a training set and a test set by a set proportion, either randomly or across time periods, training the GBDT model on the training set, and using the test set to verify the evaluation indexes of the trained model;
step six: model parameter tuning and evaluation: tuning the model's parameters, evaluating the model's discriminative ability, predictive ability and stability according to its evaluation indexes, generating an evaluation report, and comparing against models built with other algorithms to conclude whether the model can be used;
step seven: model deployment and monitoring: selecting a system platform for deployment, deploying the model to that platform, monitoring the model's information value (IV), mean value, PSI and AUC, and updating the model periodically according to the monitoring results.
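As an illustration of the WOE and IV quantities used in step four, the sketch below computes them from per-bin counts. The patent does not state its formulas, so this assumes the common convention WOE = ln(share of good samples / share of bad samples) and IV = sum over bins of (good share - bad share) x WOE. The function name `woe_iv` and the example counts are hypothetical.

```python
import math

def woe_iv(bins):
    """bins: list of (good_count, bad_count) per bin.

    Returns (list of WOE values, information value) under the assumed
    convention WOE = ln(good share / bad share).
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    woes, iv = [], 0.0
    for good, bad in bins:
        if good == 0 or bad == 0:
            # An empty class makes WOE undefined; such bins are usually merged.
            raise ValueError("bin has no good or no bad samples; merge it first")
        good_share = good / total_good
        bad_share = bad / total_bad
        woe = math.log(good_share / bad_share)
        woes.append(woe)
        iv += (good_share - bad_share) * woe
    return woes, iv

# Hypothetical counts for one variable split into three bins.
example_bins = [(90, 10), (80, 20), (50, 50)]
example_woes, example_iv = woe_iv(example_bins)
```

Under this sign convention a positive WOE marks a bin with proportionally more good samples than the population as a whole, and each term of the IV sum is non-negative, so a variable whose bins all mirror the overall good/bad mix has an IV of zero.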
Further, the "contaminated" data in step three includes redundant data, single-level data, sparse data, and missing or incomplete data.
Further, the evaluation indexes include KS, ROC, AUC and PSI.
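For readers unfamiliar with two of these indexes, here is a small sketch of AUC and KS computed from labels and model scores. It assumes label 1 marks a bad sample and a higher score means higher predicted risk; both conventions, and the function names `auc` and `ks_statistic`, are assumptions for illustration rather than anything the patent specifies.

```python
def auc(labels, scores):
    """Probability that a random bad sample scores above a random good one (ties count half)."""
    bad = [s for y, s in zip(labels, scores) if y == 1]
    good = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))

def ks_statistic(labels, scores):
    """Largest gap between the cumulative bad rate and good rate over all score cut-offs."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    best = 0.0
    for cutoff in sorted(set(scores)):
        bad_rate = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff) / n_bad
        good_rate = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff) / n_good
        best = max(best, abs(bad_rate - good_rate))
    return best

# Hypothetical scores for two good (0) and two bad (1) samples.
sample_labels = [0, 0, 1, 1]
sample_scores = [0.1, 0.4, 0.35, 0.8]
sample_auc = auc(sample_labels, sample_scores)
sample_ks = ks_statistic(sample_labels, sample_scores)
```

The pairwise AUC loop is quadratic in the sample size, which is fine for illustration; a rank-based formula or a library routine would be the usual choice on real portfolios.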
Further, a scoring card model system based on the GBDT algorithm comprises:
an information acquisition module, used for acquiring feature data of the modeling sample to obtain the initial model data;
a data preprocessing module, used for dividing the initial data into continuous data and discrete data, performing statistical analysis on each, cleaning "contaminated" data according to the statistical analysis, and handling invalid and missing values separately for the continuous data and the discrete data;
a feature engineering module, used for transforming discrete features into binary (dummy) codes; binning continuous data with an optimal binning strategy that combines equal-frequency, equal-width and chi-square binning; calculating the WOE of each bin; calculating the information value (IV) of each variable from the WOE values; ranking variables by IV to select those with the best predictive power; and replacing each bin with its WOE value;
a GBDT algorithm development module, used for dividing the feature-engineered data into a test set and a training set by a set proportion, either randomly or by application date, training the GBDT model on the training set, and using the test set to verify the model indexes of the trained model;
a GBDT algorithm parameter tuning module, used for performing multiple rounds of parameter tuning, evaluating the model's discriminative ability, predictive ability and stability according to its evaluation indexes, and finally selecting the most suitable parameter configuration, wherein the tunable parameters include the maximum number of iterations, learning rate, subsampling ratio, loss function, maximum number of features, maximum depth, minimum number of samples per leaf node, minimum number of samples required to split an internal node, and minimum impurity for a node split;
a model deployment module, used for deploying the model to a system platform after repeated training with parameter tuning and five-fold cross-validation has produced a stable, satisfactory result, monitoring the model's information value (IV), mean value, PSI and AUC, and updating the model periodically according to the monitoring results.
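The two ways of dividing the data named for the GBDT algorithm development module (randomly by proportion, or by application date) can be sketched as follows. The record layout, the `apply_date` field name and the function names are hypothetical; the patent does not prescribe a data format or a split ratio.

```python
import random
from datetime import date

def split_random(rows, test_ratio=0.3, seed=0):
    """Shuffle a copy of the rows and hold out a fixed proportion as the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def split_by_date(rows, cutoff, date_key="apply_date"):
    """Train on applications before the cutoff, test on those at or after it."""
    train = [row for row in rows if row[date_key] < cutoff]
    test = [row for row in rows if row[date_key] >= cutoff]
    return train, test

# Hypothetical applications: two per month across 2019.
applications = [{"id": i, "apply_date": date(2019, 1 + i % 12, 1)} for i in range(24)]

random_train, random_test = split_random(applications, test_ratio=0.25)
time_train, time_test = split_by_date(applications, cutoff=date(2019, 10, 1))
```

A date-based split tests the model on applications later than anything it was trained on, which is closer to how a deployed scoring model is actually used than a random split.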
This technical scheme combines the traditional credit scoring card construction process with the GBDT algorithm: after feature engineering, the GBDT algorithm replaces logistic regression for training, parameter tuning and evaluation on the modeling data. It can better handle new multi-dimensional internet data with many abnormal and missing values, improves the model's robustness, generality and accuracy, and is better suited to the needs of current big-data risk control.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only one embodiment of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The modeling method of the scoring card model based on the GBDT algorithm is shown in FIG. 1 and comprises the following steps:
step one: selecting and defining the modeling sample: first defining positive and negative samples according to the product's business, then extracting the modeling sample and excluding special customers;
step two: acquiring data features: acquiring feature data for the modeling sample from step one to obtain the initial model data, wherein the feature data includes People's Bank of China credit records, occupational characteristics, income and expenditure, bank statements, identity status, and third-party data obtained with customer authorization;
step three: data preprocessing: dividing the initial data acquired in step two into continuous data and discrete data, performing statistical analysis on each, cleaning "contaminated" data, checking data consistency, and handling invalid and missing values;
step four: feature engineering: binning the data preprocessed in step three, continuing to optimize the binning according to the weight of evidence (WOE) of each bin until the variable reaches a satisfactory information value (IV), selecting the variables that enter the model according to IV and KS, and replacing the values of the selected variables with their WOE to generate the modeling data;
step five: GBDT algorithm development: randomly dividing the modeling data obtained in step four into a training set and a test set by a set proportion, training the GBDT model on the training set, and using the test set to verify the evaluation indexes of the trained model;
step six: model parameter tuning and evaluation: tuning the model's parameters, evaluating the model's discriminative ability, predictive ability and stability according to its evaluation indexes, generating an evaluation report, and comparing against models built with other algorithms to conclude whether the model can be used;
step seven: model deployment and monitoring: selecting a system platform for deployment, deploying the model to that platform, monitoring the model's information value (IV), mean value, PSI and AUC, and updating the model periodically according to the monitoring results.
In this embodiment, the scoring card model system based on the GBDT algorithm includes:
an information acquisition module, used for acquiring feature data of the modeling sample to obtain the initial model data;
a data preprocessing module, used for dividing the initial data into continuous data and discrete data, performing statistical analysis on each, cleaning "contaminated" data according to the statistical analysis, and handling invalid and missing values separately for the continuous data and the discrete data;
a feature engineering module, used for transforming discrete features into binary (dummy) codes; binning continuous data with an optimal binning strategy that combines equal-frequency, equal-width and chi-square binning; calculating the WOE of each bin; calculating the information value (IV) of each variable from the WOE values; ranking variables by IV to select those with the best predictive power; and replacing each bin with its WOE value;
a GBDT algorithm development module, used for randomly dividing the feature-engineered data into a test set and a training set by a set proportion, training the GBDT model on the training set, and using the test set to verify the model indexes of the trained model;
a GBDT algorithm parameter tuning module, used for performing multiple rounds of parameter tuning, evaluating the model's discriminative ability, predictive ability and stability according to its evaluation indexes, and finally selecting the most suitable parameter configuration, wherein the tunable parameters include the maximum number of iterations, learning rate, subsampling ratio, loss function, maximum number of features, maximum depth, minimum number of samples per leaf node, minimum number of samples required to split an internal node, and minimum impurity for a node split;
a model deployment module, used for deploying the model to a system platform after repeated training with parameter tuning and five-fold cross-validation has produced a stable, satisfactory result, monitoring the model's information value (IV), mean value, PSI and AUC, and updating the model periodically according to the monitoring results.
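To make the list of tunable parameters concrete, the sketch below writes them as a search grid and enumerates the candidate configurations for multi-round tuning. The patent names no library; the keys here are borrowed from scikit-learn's `GradientBoostingClassifier` as an assumed mapping, and every value is a hypothetical placeholder. The loss function is left out of the grid because its accepted names differ between library versions.

```python
from itertools import product

# Assumed mapping from the parameters listed above to scikit-learn style names.
PARAM_GRID = {
    "n_estimators": [100, 300],        # maximum number of iterations
    "learning_rate": [0.05, 0.1],      # learning rate
    "subsample": [0.8, 1.0],           # subsampling ratio
    "max_features": ["sqrt", None],    # maximum number of features
    "max_depth": [3, 5],               # maximum depth
    "min_samples_leaf": [50, 200],     # minimum samples per leaf node
    "min_samples_split": [100, 400],   # minimum samples to split an internal node
    "min_impurity_decrease": [0.0],    # impurity threshold for a node split
}

def candidate_configs(grid):
    """Yield every combination of parameter values as a dict."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))

all_configs = list(candidate_configs(PARAM_GRID))
```

Each configuration would then be scored, for example with the five-fold cross-validation the deployment module mentions, and the best-performing one kept. An exhaustive grid grows quickly, so in practice parameters are often tuned a few at a time.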
In this embodiment, the "contaminated" data in step three includes redundant data, single-level data, sparse data, and missing or incomplete data; the evaluation indexes include KS, ROC, AUC and PSI.
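Of these indexes, PSI (population stability index) is the one used for stability monitoring. A minimal sketch follows, assuming the usual definition PSI = sum over bins of (actual share - expected share) x ln(actual share / expected share), computed over the same score bins at two points in time. The function name `psi`, the `eps` floor and the counts are hypothetical.

```python
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index between a baseline and a current distribution.

    Both arguments hold sample counts for the same ordered score bins.
    Shares are floored at eps so that an empty bin does not cause log(0).
    """
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    value = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        expected_share = max(expected / expected_total, eps)
        actual_share = max(actual / actual_total, eps)
        value += (actual_share - expected_share) * math.log(actual_share / expected_share)
    return value

# Hypothetical score-bin counts at training time and in a later monitoring window.
baseline_counts = [100, 200, 300]
stable_psi = psi(baseline_counts, [100, 200, 300])
shifted_psi = psi(baseline_counts, [300, 200, 100])
```

An unchanged distribution gives a PSI of zero, and the value grows as the monitored population drifts away from the training-time population, which is the signal that would prompt a model update.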
This embodiment combines the traditional credit scoring card construction process with the GBDT algorithm: after feature engineering, the GBDT algorithm replaces logistic regression for training, parameter tuning and evaluation on the modeling data. It can better handle new multi-dimensional internet data with many abnormal and missing values, improves the model's robustness, generality and accuracy, and is better suited to the needs of current big-data risk control.
It should be understood that the above-described embodiments are merely exemplary of the present invention, and are not intended to limit the present invention, and that any modification, equivalent replacement, or improvement made without departing from the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. A modeling method for a scoring card model based on the GBDT algorithm, characterized by comprising the following steps:
step one: selecting and defining the modeling sample: first defining positive and negative samples according to the product's business, then extracting the modeling sample and excluding special customers;
step two: acquiring data features: acquiring feature data for the modeling sample from step one to obtain the initial model data, wherein the feature data includes People's Bank of China credit records, occupational characteristics, income and expenditure, bank statements, identity status, and third-party data obtained with customer authorization;
step three: data preprocessing: dividing the initial data acquired in step two into continuous data and discrete data, performing statistical analysis on each, cleaning "contaminated" data, checking data consistency, and handling invalid and missing values;
step four: feature engineering: binning the data preprocessed in step three, continuing to optimize the binning according to the weight of evidence (WOE) of each bin until the variable reaches a satisfactory information value (IV), selecting the variables that enter the model according to IV and KS, and replacing the values of the selected variables with their WOE to generate the modeling data;
step five: GBDT algorithm development: dividing the modeling data obtained in step four into a training set and a test set by a set proportion, either randomly or across time periods, training the GBDT model on the training set, and using the test set to verify the evaluation indexes of the trained model;
step six: model parameter tuning and evaluation: tuning the model's parameters, evaluating the model's discriminative ability, predictive ability and stability according to its evaluation indexes, generating an evaluation report, and comparing against models built with other algorithms to conclude whether the model can be used;
step seven: model deployment and monitoring: selecting a system platform for deployment, deploying the model to that platform, monitoring the model's information value (IV), mean value, PSI and AUC, and updating the model periodically according to the monitoring results.
2. The modeling method for a scoring card model based on the GBDT algorithm according to claim 1, characterized in that the "contaminated" data in step three includes redundant data, single-level data, sparse data, and missing or incomplete data.
3. The modeling method for a scoring card model based on the GBDT algorithm according to claim 1, characterized in that the evaluation indexes include KS, ROC, AUC and PSI.
4. A scoring card model system based on the GBDT algorithm, characterized by comprising:
an information acquisition module, used for acquiring feature data of the modeling sample to obtain the initial model data;
a data preprocessing module, used for dividing the initial data into continuous data and discrete data, performing statistical analysis on each, cleaning "contaminated" data according to the statistical analysis, and handling invalid and missing values separately for the continuous data and the discrete data;
a feature engineering module, used for transforming discrete features into binary (dummy) codes; binning continuous data with an optimal binning strategy that combines equal-frequency, equal-width and chi-square binning; calculating the WOE of each bin; calculating the information value (IV) of each variable from the WOE values; ranking variables by IV to select those with the best predictive power; and replacing each bin with its WOE value;
a GBDT algorithm development module, used for dividing the feature-engineered data into a test set and a training set by a set proportion, either randomly or by application date, training the GBDT model on the training set, and using the test set to verify the model indexes of the trained model;
a GBDT algorithm parameter tuning module, used for performing multiple rounds of parameter tuning, evaluating the model's discriminative ability, predictive ability and stability according to its evaluation indexes, and finally selecting the most suitable parameter configuration, wherein the tunable parameters include the maximum number of iterations, learning rate, subsampling ratio, loss function, maximum number of features, maximum depth, minimum number of samples per leaf node, minimum number of samples required to split an internal node, and minimum impurity for a node split;
a model deployment module, used for deploying the model to a system platform after repeated training with parameter tuning and five-fold cross-validation has produced a stable, satisfactory result, monitoring the model's information value (IV), mean value, PSI and AUC, and updating the model periodically according to the monitoring results.
CN202010236404.6A 2020-03-30 2020-03-30 Modeling method and system of grading card model based on GBDT algorithm Withdrawn CN111311400A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010236404.6A CN111311400A (en) 2020-03-30 2020-03-30 Modeling method and system of grading card model based on GBDT algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010236404.6A CN111311400A (en) 2020-03-30 2020-03-30 Modeling method and system of grading card model based on GBDT algorithm

Publications (1)

Publication Number Publication Date
CN111311400A (en) 2020-06-19

Family

ID=71162433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010236404.6A Withdrawn CN111311400A (en) 2020-03-30 2020-03-30 Modeling method and system of grading card model based on GBDT algorithm

Country Status (1)

Country Link
CN (1) CN111311400A (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753995A (en) * 2020-06-23 2020-10-09 华东师范大学 Local interpretable method based on gradient lifting tree
CN111861253A (en) * 2020-07-29 2020-10-30 北京车薄荷科技有限公司 Personnel capacity determining method and system
CN111861705A (en) * 2020-07-10 2020-10-30 深圳无域科技技术有限公司 Financial wind control logistic regression feature screening method and system
CN111898675A (en) * 2020-07-30 2020-11-06 北京云从科技有限公司 Credit wind control model generation method and device, scoring card generation method, machine readable medium and equipment
CN111950624A (en) * 2020-08-10 2020-11-17 中国平安人寿保险股份有限公司 Client risk assessment model construction method and device, storage medium and terminal equipment
CN112102074A (en) * 2020-10-14 2020-12-18 深圳前海弘犀智能科技有限公司 Grading card modeling method
CN112116454A (en) * 2020-09-28 2020-12-22 中国建设银行股份有限公司 Credit evaluation method and device
CN112184412A (en) * 2020-09-22 2021-01-05 中国建设银行股份有限公司 Modeling method, device, medium and electronic equipment of credit rating card model
CN112232944A (en) * 2020-09-29 2021-01-15 中诚信征信有限公司 Scoring card creating method and device and electronic equipment
CN112287991A (en) * 2020-10-26 2021-01-29 上海数鸣人工智能科技有限公司 Dpi feature selection method based on L1-regularized logistic regression and GBDT
CN112734568A (en) * 2021-01-29 2021-04-30 深圳前海微众银行股份有限公司 Credit scoring card model construction method, device, equipment and readable storage medium
CN112749921A (en) * 2021-02-01 2021-05-04 深圳无域科技技术有限公司 Mathematical modeling method, system, device and computer readable medium
CN112785415A (en) * 2021-01-20 2021-05-11 深圳前海微众银行股份有限公司 Scoring card model construction method, device, equipment and computer readable storage medium
CN113010493A (en) * 2021-03-16 2021-06-22 北京云从科技有限公司 Data quality online analysis method and device, machine readable medium and equipment
CN113064883A (en) * 2020-09-28 2021-07-02 开鑫金服(南京)信息服务有限公司 Method for constructing logistics wind control model, computer equipment and storage medium
CN113177643A (en) * 2021-05-24 2021-07-27 北京融七牛信息技术有限公司 Automatic modeling system based on big data
CN113177642A (en) * 2021-05-24 2021-07-27 北京融七牛信息技术有限公司 Automatic modeling system for data imbalance
CN113205403A (en) * 2021-03-30 2021-08-03 北京中交兴路信息科技有限公司 Method and device for calculating enterprise credit level, storage medium and terminal
CN113344700A (en) * 2021-07-27 2021-09-03 上海华瑞银行股份有限公司 Wind control model construction method and device based on multi-objective optimization and electronic equipment
CN113657481A (en) * 2021-08-13 2021-11-16 上海晓途网络科技有限公司 Model construction system and method
CN115471056A (en) * 2022-08-31 2022-12-13 鼎翰文化股份有限公司 Data transmission method and data transmission system
CN111984637B (en) * 2020-07-06 2023-04-18 苏州研数信息科技有限公司 Missing value processing method and device in data modeling, equipment and storage medium

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753995A (en) * 2020-06-23 2020-10-09 华东师范大学 Local interpretable method based on gradient lifting tree
CN111984637B (en) * 2020-07-06 2023-04-18 苏州研数信息科技有限公司 Missing value processing method and device in data modeling, equipment and storage medium
CN111861705A (en) * 2020-07-10 2020-10-30 深圳无域科技技术有限公司 Financial wind control logistic regression feature screening method and system
CN111861253A (en) * 2020-07-29 2020-10-30 北京车薄荷科技有限公司 Personnel capacity determining method and system
CN111898675B (en) * 2020-07-30 2021-04-23 北京云从科技有限公司 Credit wind control model generation method and device, scoring card generation method, machine readable medium and equipment
CN111898675A (en) * 2020-07-30 2020-11-06 北京云从科技有限公司 Credit wind control model generation method and device, scoring card generation method, machine readable medium and equipment
CN111950624A (en) * 2020-08-10 2020-11-17 中国平安人寿保险股份有限公司 Client risk assessment model construction method and device, storage medium and terminal equipment
CN112184412A (en) * 2020-09-22 2021-01-05 中国建设银行股份有限公司 Modeling method, device, medium and electronic equipment of credit rating card model
CN113064883A (en) * 2020-09-28 2021-07-02 开鑫金服(南京)信息服务有限公司 Method for constructing logistics wind control model, computer equipment and storage medium
CN112116454A (en) * 2020-09-28 2020-12-22 中国建设银行股份有限公司 Credit evaluation method and device
CN112232944A (en) * 2020-09-29 2021-01-15 中诚信征信有限公司 Scoring card creating method and device and electronic equipment
CN112102074A (en) * 2020-10-14 2020-12-18 深圳前海弘犀智能科技有限公司 Grading card modeling method
CN112102074B (en) * 2020-10-14 2024-01-30 深圳前海弘犀智能科技有限公司 Score card modeling method
CN112287991A (en) * 2020-10-26 2021-01-29 上海数鸣人工智能科技有限公司 Dpi feature selection method based on L1-regularized logistic regression and GBDT
CN112287991B (en) * 2020-10-26 2024-05-03 上海数鸣人工智能科技有限公司 Dpi feature selection method based on L1-regularized logistic regression and GBDT
CN112785415A (en) * 2021-01-20 2021-05-11 深圳前海微众银行股份有限公司 Scoring card model construction method, device, equipment and computer readable storage medium
CN112785415B (en) * 2021-01-20 2024-01-12 深圳前海微众银行股份有限公司 Method, device and equipment for constructing scoring card model and computer readable storage medium
CN112734568A (en) * 2021-01-29 2021-04-30 深圳前海微众银行股份有限公司 Credit scoring card model construction method, device, equipment and readable storage medium
CN112734568B (en) * 2021-01-29 2024-01-12 深圳前海微众银行股份有限公司 Credit scoring card model construction method, device, equipment and readable storage medium
CN112749921A (en) * 2021-02-01 2021-05-04 深圳无域科技技术有限公司 Mathematical modeling method, system, device and computer readable medium
CN113010493A (en) * 2021-03-16 2021-06-22 北京云从科技有限公司 Data quality online analysis method and device, machine readable medium and equipment
CN113205403A (en) * 2021-03-30 2021-08-03 北京中交兴路信息科技有限公司 Method and device for calculating enterprise credit level, storage medium and terminal
CN113177642A (en) * 2021-05-24 2021-07-27 北京融七牛信息技术有限公司 Automatic modeling system for data imbalance
CN113177643A (en) * 2021-05-24 2021-07-27 北京融七牛信息技术有限公司 Automatic modeling system based on big data
CN113344700A (en) * 2021-07-27 2021-09-03 上海华瑞银行股份有限公司 Wind control model construction method and device based on multi-objective optimization and electronic equipment
CN113344700B (en) * 2021-07-27 2024-04-09 上海华瑞银行股份有限公司 Multi-objective optimization-based wind control model construction method and device and electronic equipment
CN113657481A (en) * 2021-08-13 2021-11-16 上海晓途网络科技有限公司 Model construction system and method
CN115471056A (en) * 2022-08-31 2022-12-13 鼎翰文化股份有限公司 Data transmission method and data transmission system

Similar Documents

Publication Publication Date Title
CN111311400A (en) Modeling method and system of grading card model based on GBDT algorithm
CN109977028A (en) A kind of Software Defects Predict Methods based on genetic algorithm and random forest
CN112037009A (en) Risk assessment method for consumption credit scene based on random forest algorithm
CN110866819A (en) Automatic credit scoring card generation method based on meta-learning
CN109800875A (en) Chemical industry fault detection method based on particle group optimizing and noise reduction sparse coding machine
CN112039903B (en) Network security situation assessment method based on deep self-coding neural network model
Pandey et al. An analysis of machine learning techniques (J48 & AdaBoost)-for classification
CN111126868B (en) Road traffic accident occurrence risk determination method and system
CN113537807B (en) Intelligent wind control method and equipment for enterprises
CN112085869A (en) Civil aircraft flight safety analysis method based on flight parameter data
CN106682159A (en) Threshold configuration method
CN111079937A (en) Rapid modeling method
CN111986027A (en) Abnormal transaction processing method and device based on artificial intelligence
CN111260490A (en) Rapid claims settlement method and system based on tree model for car insurance
CN111275485A (en) Power grid customer grade division method and system based on big data analysis, computer equipment and storage medium
CN113221442B (en) Method and device for constructing health assessment model of power plant equipment
CN113487241A (en) Method, device, equipment and storage medium for classifying enterprise environment-friendly credit grades
Tsai et al. Data pre-processing by genetic algorithms for bankruptcy prediction
CN114091549A (en) Equipment fault diagnosis method based on deep residual error network
CN113064976A (en) Accident vehicle judgment method based on deep learning algorithm
CN115828161A (en) Automobile fault type prediction method and device based on recurrent neural network
CN115496364A (en) Method and device for identifying heterogeneous enterprises, storage medium and electronic equipment
CN115392710A (en) Wind turbine generator operation decision method and system based on data filtering
CN114626433A (en) Fault prediction and classification method, device and system for intelligent electric energy meter
CN113326971A (en) PCA (principal component analysis) and Adaboost-based tunnel traffic accident duration prediction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200619
