CN110738564A - Post-loan risk assessment method and device and storage medium - Google Patents

Post-loan risk assessment method and device and storage medium

Publication number
CN110738564A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910983490.4A
Other languages
Chinese (zh)
Inventor
王晨曦
林路
王慜骊
郏维强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SUNYARD SYSTEM ENGINEERING Co Ltd
Original Assignee
SUNYARD SYSTEM ENGINEERING Co Ltd
Application filed by SUNYARD SYSTEM ENGINEERING Co Ltd
Priority to CN201910983490.4A
Publication of CN110738564A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The embodiment of the invention discloses a post-loan risk assessment method, device and storage medium. The method comprises the following steps: preprocessing stored loan data, wherein the preprocessing comprises data dimension reduction, data reclassification, data merging and data cleaning, and the reclassified loan data comprises static data and dynamic data; performing feature engineering on the preprocessed dynamic data to extract corresponding feature engineering features; constructing an assessment model based on a deep learning network, and training the assessment model using expert features from a financial model together with the feature engineering features; and performing data assessment based on the assessment model and outputting a final approval list.

Description

Post-loan risk assessment method and device and storage medium
Technical Field
The invention relates to the technical field of post-loan risk management and control, and in particular to a post-loan risk assessment method, device and storage medium.
Background
With the deepening development of inclusive finance, the financial loan market is maturing day by day, the loan demand of small and micro enterprises keeps growing, and the requirements on post-loan management, post-loan early warning, debt collection and the like are rising as well.
At present, in the traditional technology that banks use for post-loan management and rating of small and micro enterprises, data on customer ratings, debt factors and financial factors are collected, and the variables with the strongest power to explain default are found through univariate regression followed by multivariate regression; the number of variables is usually on the order of 10^2, and the model is mostly built with weight of evidence (WOE) and logistic regression. In practice, financing for small and micro enterprises is short-term, small, frequent, urgent and scattered, so the traditional post-loan management mode is difficult to apply. Reliable data are difficult to obtain from a single enterprise, a single item or a single channel, and the financial statements of small and micro enterprises are easily distorted. As a result, the assessment techniques used in the traditional mode struggle to evaluate risk accurately, the post-loan work of customer managers becomes complicated and disordered, the workload increases, and potential post-loan risks cannot be identified effectively.
Disclosure of Invention
The embodiment of the invention provides a post-loan risk assessment method, device and storage medium, which can improve the efficiency of post-loan data analysis, improve the accuracy of the risk assessment model, extend the lead time with which a default can be predicted, and ultimately control post-loan risk effectively.
A first aspect of an embodiment of the present invention provides a post-loan risk assessment method, which may include:
preprocessing the stored loan data, wherein the preprocessing comprises data dimension reduction, data reclassification, data merging and data cleaning, and the reclassified loan data comprises static data and dynamic data;
performing feature engineering on the preprocessed dynamic data to extract corresponding feature engineering features;
constructing an evaluation model based on a deep learning network, and training the evaluation model using expert features from a financial model together with the feature engineering features;
and evaluating the data based on the evaluation model and outputting a final approval list.
A second aspect of an embodiment of the present invention provides a post-loan risk assessment apparatus, which may include:
a data preprocessing module, used for preprocessing the stored loan data, wherein the preprocessing comprises data dimension reduction, data reclassification, data merging and data cleaning, and the reclassified loan data comprises static data and dynamic data;
a feature engineering module, used for performing feature engineering on the preprocessed dynamic data to extract corresponding feature engineering features;
a model construction module, used for constructing an evaluation model based on a deep learning network and optimizing the evaluation model using expert features from a financial model together with the feature engineering features;
and a data evaluation module, used for performing data evaluation based on the evaluation model and outputting a final approval list.
A third aspect of embodiments of the present invention provides a computer device, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the post-loan risk assessment method described in the above aspects.
A fourth aspect of the embodiments of the present invention provides a computer storage medium storing at least one instruction, at least one program, a code set, or an instruction set, wherein the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the post-loan risk assessment method described in the above aspects.
In the embodiment of the invention, expert features from the traditional financial model are combined with a deep learning method: feature engineering on the dynamic data mines deep features, such as the time-series features present in the credit transaction flow, and combining these with the existing expert model and features improves model accuracy. The lead time with which a default can be predicted is greatly extended and the prediction is more accurate, which greatly reduces credit risk as well as the tracking workload and working time of auditors. A lender can therefore extend credit to more customers, so that its loans are fully diversified and risk is controlled, which makes large-scale credit approval for small and micro enterprises, and efficient, intelligent and rapid lending and risk control, possible.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a post-loan risk assessment method according to an embodiment of the invention;
FIG. 2 is a schematic representation of a KS curve for risk assessment as provided by an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a post-loan risk assessment apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a feature engineering module provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a model building module provided in an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention.
The terms "comprising" and "having" and any variations thereof in the description and claims of this invention and the above-described drawings are intended to cover non-exclusive inclusions, and the terms "first" and "second" are used merely as distinguishing designations and do not represent a numerical size or ordering. For example, a process, method, system, article, or apparatus that comprises a series of steps or elements is not limited to the listed steps or elements, but may optionally include additional steps or elements not listed, or other steps or elements inherent to such process, method, article, or apparatus.
In the embodiment of the present invention, the post-loan risk assessment method may be applied to a computer device, where the computer device may be a terminal such as a tablet computer or a Personal Computer (PC), or may be another electronic device with computing and processing capability.
As shown in fig. 1, the post-loan risk assessment method may include at least the following steps:
S101: preprocess the stored loan data.
It is understood that the preprocessing may specifically include data dimension reduction, data reclassification, data merging, data cleaning and the like. Data reclassification divides the loan data into static data and dynamic data. The static data may include the following aspects:
1. Basic customer information: taken from the core CIF and the customer information table in the credit system, including basic information such as customer name, industry, enterprise type, number of days since establishment, registered capital, paid-in capital and place of business.
2. Customer behavior features: ledger information extracted from the thin-belt system, from which counts and amounts are computed for customer actions such as credit, acceptance, cash withdrawal and letters of guarantee, together with data such as whether the customer pays water and electricity bills and whether the customer has guarantee behavior.
3. Customer association relationships: associations that can be extracted from acceptance payees, guarantee contracts, customer transaction flows, and the industry chain of the customer's upstream and downstream customers.
4. Customer status: the customer's information in the loan ledger.
Dynamic data is mainly the customer's settlement flow, that is, complete time-stamped transaction flow data including remarks, time, purpose, counterparty, channel and other information. The dynamic data is labeled: customers with default behavior are marked as positive samples "1", and customers without default behavior are marked as negative samples "0".
For data dimension reduction, clustering is mainly carried out between every two to three dimensions of the loan data, and binning and dummy variable processing are carried out after clustering. The information gain of each quantized dummy variable is then calculated and ranked, and dummy variable processing is applied to the loan data where the information gain exceeds a set threshold.
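As a non-limiting illustration, the binning, dummy-variable and information-gain ranking described above might be sketched as follows. The clustering stage is omitted, and equal-width binning, the function names, the column names and the threshold of 0.1 are all assumptions made for this sketch rather than details taken from the embodiment.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy reduction obtained by splitting `labels` on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def equal_width_bin(values, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1] (assumed binning rule)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def select_dummies(columns, labels, n_bins=3, threshold=0.1):
    """Rank binned dummy variables by information gain and keep those above the threshold."""
    scored = []
    for name, values in columns.items():
        bins = equal_width_bin(values, n_bins)
        for b in sorted(set(bins)):
            dummy = [1 if x == b else 0 for x in bins]
            gain = information_gain(dummy, labels)
            if gain > threshold:
                scored.append((f"{name}_bin{b}", gain))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical toy data: label 1 marks a customer with default behavior.
columns = {
    "overdue_days": [0, 1, 2, 30, 45, 60],
    "registered_capital": [10, 20, 10, 20, 10, 20],
}
labels = [0, 0, 0, 1, 1, 1]
selected = select_dummies(columns, labels)
```

In this toy data only the dummy variables derived from `overdue_days` pass the threshold, because `registered_capital` carries almost no information about the label.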
For data cleaning, in the present embodiment static data is not considered to contain redundant data, while for dynamic data the counterparty transaction flow needs to be removed, that is, the counterparty flow records are removed from the transaction flow data.
S102: perform feature engineering on the preprocessed dynamic data to extract corresponding feature engineering features.
It can be understood that the feature engineering mainly comprises three parts: shallow feature extraction, statistical feature extraction and deep feature extraction. Shallow feature extraction segments and discretizes continuous features. Statistical feature extraction extracts descriptive statistics of the time series within the data observation window, where the descriptive statistics comprise one or more of mean, standard deviation and quantile difference. Deep feature extraction extracts the deep features of the data. Finally, the three kinds of features can be combined to obtain the feature engineering features.
It should be noted that, in order to avoid the influence of non-uniform data distribution and extreme values, the device needs to perform segmented discretization on continuous features when performing shallow feature extraction. The discretized data is one-hot encoded, which expands each feature into a plurality of features; missing data is filled with "-1", treated as one more type of discrete value, and also participates in the one-hot encoding.
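The segmented discretization and one-hot encoding described above might look like the following minimal sketch. The segment boundaries, the function names and the use of `None` to represent a missing value are assumptions made for illustration only.

```python
def discretize(value, edges):
    """Return the index of the segment containing `value`; a missing value (None) maps to -1."""
    if value is None:
        return -1
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)

def one_hot(category, categories):
    """Expand one discrete value into a 0/1 vector over all known categories."""
    return [1 if category == c else 0 for c in categories]

def shallow_features(values, edges):
    """Discretize a continuous feature and one-hot encode it, with -1 as its own category."""
    categories = [-1] + list(range(len(edges) + 1))
    return [one_hot(discretize(v, edges), categories) for v in values]

# Hypothetical transaction amounts; the second one is missing.
rows = shallow_features([500.0, None, 12000.0, 80000.0], edges=[1000.0, 50000.0])
```

Each row has four positions: the first marks a missing value and the remaining three mark the segments below 1000, from 1000 up to 50000, and 50000 or above.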
Preferably, when deep feature extraction is performed, the device may extract deep and time-series features by using an LSTM network or the like.
The observation window may be set as follows: read the tail of the source data and find the latest data date; initialize the head start value, the start value and the end value of the observation window to -1; once the latest data is found, set the observation window to a preset length; count back from the latest date by the length of the observation window to obtain the final early-warning date, and set it as the start date of the observation window (the date is mainly used for distinguishing).
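A minimal sketch of selecting an observation window that ends at the latest data date, and of computing the descriptive statistics of the series inside it, is given below. The 30-day window, the record layout `(date, amount)` and the use of the interquartile range as the quantile difference are assumptions, not details fixed by the embodiment.

```python
from datetime import date, timedelta
import statistics

def observation_window(records, window_days):
    """Count back `window_days` from the latest record date and keep the records inside."""
    latest = max(d for d, _ in records)
    start = latest - timedelta(days=window_days)
    inside = [(d, amount) for d, amount in records if start < d <= latest]
    return start, latest, inside

def window_statistics(amounts):
    """Mean, standard deviation and interquartile range of the values in the window."""
    q1, _, q3 = statistics.quantiles(amounts, n=4)
    return {
        "mean": statistics.fmean(amounts),
        "std": statistics.pstdev(amounts),
        "iqr": q3 - q1,
    }

# Hypothetical settlement flow: (transaction date, amount).
records = [
    (date(2019, 1, 1), 50.0),
    (date(2019, 3, 5), 100.0),
    (date(2019, 3, 15), 200.0),
    (date(2019, 3, 25), 300.0),
    (date(2019, 3, 31), 400.0),
]
start, latest, window = observation_window(records, window_days=30)
stats = window_statistics([amount for _, amount in window])
```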
S103: build an evaluation model based on the deep learning network, and train the evaluation model using expert features and feature engineering features.
Before modeling, the device may define the following parameters: the observation window, defined according to the data available; and the performance window, defined as running from loan opening to date, that is, a loan is measured and marked "good" or "bad" according to its behavior from loan approval and opening to date.
In a specific implementation, the device may use the expert features and the feature engineering features to build the following models respectively. Firstly, for the dynamic data, an LSTM network or the like is used to extract deep and time-series features and to build a model. The model is trained using the efficient ADAM optimization algorithm and a mean square error function.
The specific steps are as follows:
Step 1: the device may define gated recurrent units (GRU) as the deep learning model framework. In this step, model hyper-parameters may be selected with an optimal hyper-parameter search algorithm, using the standard performance metric of a LightGBM ensemble tree model.
Step 2: convert the feature engineering features into feature-vector form, use whether the user has defaulted as the evaluation label, and train the deep learning model directly on these samples.
Step 3: extract the hidden-layer features of the neural network, that is, take the values of the penultimate layer of the neural network as data features, which finally form the sample data for the upper-layer evaluation model.
Preferably, the efficient ADAM optimization algorithm and a mean square error function can be used for training.
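To make the training objective concrete, the following sketch applies the ADAM update rule to a mean square error loss. For brevity it fits a two-parameter linear model rather than the GRU or LSTM network of the embodiment; the model, the learning rate, the step count and the toy data are all assumptions made for this sketch.

```python
import math

def adam_fit(xs, ys, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Fit y ~ w * x + b by minimizing the mean square error with the Adam update rule."""
    params = [0.0, 0.0]  # w, b
    m = [0.0, 0.0]       # first-moment (mean) estimates of the gradients
    v = [0.0, 0.0]       # second-moment (uncentered variance) estimates
    n = len(xs)
    for t in range(1, steps + 1):
        errors = [params[0] * x + params[1] - y for x, y in zip(xs, ys)]
        # Gradients of the MSE with respect to w and b.
        grads = [
            2.0 / n * sum(e * x for e, x in zip(errors, xs)),
            2.0 / n * sum(errors),
        ]
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    mse = sum((params[0] * x + params[1] - y) ** 2 for x, y in zip(xs, ys)) / n
    return params[0], params[1], mse

# Toy data generated from y = 2x + 1.
w, b, mse = adam_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

In a deep learning framework the same update is applied to every weight of the recurrent network, with gradients obtained by backpropagation instead of the closed-form expressions used here.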
During model integration, the optimal model hyper-parameter search algorithm adopted is specifically the Hyperopt/skopt search algorithm.
Finally, the trained model is fitted and outputs data from which the default probability can be predicted.
Secondly, the device may train a model on the static data using a support vector machine and output the result.
Thirdly, the device may use a decision-tree random forest method to train a default/non-default model on the deep features obtained in the first step together with the static data, and output a model result. Because the outputs come from different types of models, the Neural Module Network (NMN) technique from combined-model methods is adopted: the multi-modal data are divided into modules according to function, and the output results are mapped into the same vector space and fused. If the final prediction is also a binary classification result, the results are combined by a weighted sum, and the weighted sum is binarized with a preset threshold: a value greater than the threshold is set to 1, and a value less than or equal to the threshold is set to 0.
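The weighted fusion and threshold binarization at the end of this step might be sketched as follows. The model names, scores, weights and the threshold of 0.5 are hypothetical, and dividing by the sum of the weights is an assumption added so that the threshold keeps the same meaning whatever the weights are.

```python
def fuse_predictions(model_scores, weights, threshold=0.5):
    """Weighted sum of per-model default scores, binarized against a preset threshold."""
    total_weight = sum(weights)
    fused = []
    for scores in zip(*model_scores):  # one tuple of model outputs per customer
        weighted = sum(w * s for w, s in zip(weights, scores)) / total_weight
        fused.append(1 if weighted > threshold else 0)
    return fused

# Hypothetical default scores for three customers from three models.
lstm_scores = [0.9, 0.2, 0.4]
svm_scores = [0.8, 0.1, 0.3]
forest_scores = [0.7, 0.4, 0.5]
decisions = fuse_predictions(
    [lstm_scores, svm_scores, forest_scores], weights=[0.5, 0.2, 0.3]
)
```

A fused value exactly equal to the threshold is mapped to 0, matching the "less than or equal to" rule stated above.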
In an alternative embodiment, the device may apply model exclusion to the finally built model, for example when either of the following two cases is satisfied: (1) the loan customer's number of overdue loans > -2; (2) the loan customer has been silent for the last 6 months.
S104: evaluate the data based on the evaluation model and output a final approval list.
In a specific implementation, the device may randomly split the data into a training set and a validation set in proportion, and may repeat the above process to assess risk whenever new data arrive.
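A proportional random split into a training set and a validation set can be sketched as below; the 80/20 ratio, the fixed seed and the function name are assumptions chosen for illustration.

```python
import random

def train_validation_split(samples, validation_ratio=0.2, seed=0):
    """Shuffle the samples reproducibly and split them by the given proportion."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_ratio))
    return shuffled[n_validation:], shuffled[:n_validation]

# Hypothetical customer identifiers.
customers = [f"customer_{i}" for i in range(10)]
train_set, validation_set = train_validation_split(customers)
```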
In an alternative embodiment, the device may output a KS (Kolmogorov-Smirnov) graph (as shown in fig. 2), from which the accuracy of the assessment can be judged more quickly and clearly. The vertical axis of the KS curve is the distance between the TPR and FPR curves, the horizontal axis is the threshold range [0,1], and the KS value is the maximum distance between the two curves. The larger the KS value, the better the model distinguishes positive customers from negative customers; KS > 0.2 means that the model has good prediction accuracy.
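The KS value described above, the maximum gap between the TPR and FPR curves over all thresholds, might be computed as in this sketch. The scores and labels are hypothetical, and using each distinct score as a candidate threshold is an implementation choice rather than part of the embodiment.

```python
def ks_statistic(scores, labels):
    """Return (KS value, threshold), where KS is the largest |TPR - FPR| over all thresholds."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best_ks, best_threshold = 0.0, None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        ks = abs(tp / positives - fp / negatives)
        if ks > best_ks:
            best_ks, best_threshold = ks, threshold
    return best_ks, best_threshold

# Hypothetical predicted default probabilities and true labels (1 = default).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
ks_value, ks_threshold = ks_statistic(scores, labels)
```

For this toy data the KS value is 2/3, well above the 0.2 level mentioned in the text.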
In the embodiment of the invention, expert features from the traditional financial model are combined with a deep learning method: feature engineering on the dynamic data mines deep features, such as the time-series features present in the credit transaction flow, and combining these with the existing expert model and features improves model accuracy. The lead time with which a default can be predicted is greatly extended and the prediction is more accurate, which greatly reduces credit risk as well as the tracking workload and working time of auditors. A lender can therefore extend credit to more customers, so that its loans are fully diversified and risk is controlled, which makes large-scale credit approval for small and micro enterprises, and efficient, intelligent and rapid lending and risk control, possible.
The following describes the post-loan risk assessment apparatus according to an embodiment of the present invention with reference to fig. 3 to 5. It should be noted that the post-loan risk assessment apparatus shown in fig. 3-5 is used for executing the method of the embodiment of the present invention shown in fig. 1 and 2. For convenience of illustration, only the portion related to the embodiment of the present invention is shown; for specific technical details that are not disclosed here, please refer to the embodiment of the present invention shown in fig. 1 and 2.
Referring to fig. 3, a schematic structural diagram of a post-loan risk assessment device is provided for an embodiment of the present invention. As shown in fig. 3, a post-loan risk assessment device 10 according to an embodiment of the present invention may include a data preprocessing module 101, a feature engineering module 102, a model construction module 103, a data assessment module 104, and an expert feature processing module 105. As shown in fig. 4, the feature engineering module 102 includes a shallow feature extraction unit 1021, a statistical feature extraction unit 1022, a deep feature extraction unit 1023, and a feature combination unit 1024. As shown in fig. 5, the model construction module 103 includes a first model training unit 1031, a second model training unit 1032, and a model result output unit 1033.
The data preprocessing module 101 is configured to preprocess the stored loan data, where the preprocessing includes data dimension reduction, data reclassification, data merging and data cleaning, and the reclassified loan data includes static data and dynamic data.
The feature engineering module 102 is configured to perform feature engineering on the preprocessed dynamic data to extract corresponding feature engineering features.
In an alternative embodiment, the feature engineering module 102 includes:
A shallow feature extraction unit 1021, configured to perform shallow feature extraction on the preprocessed loan data, performing segmented discretization on the continuous features to obtain shallow features.
A statistical feature extraction unit 1022, configured to perform statistical feature extraction on the preprocessed loan data, extracting descriptive statistics of the time series within the data observation window to obtain statistical features, where the descriptive statistics include one or more of a mean, a standard deviation, and a variance.
A deep feature extraction unit 1023, configured to perform deep feature extraction on the preprocessed loan data to obtain corresponding deep features.
A feature combining unit 1024, configured to combine the shallow features, the statistical features and the deep features to form the feature engineering features.
The model construction module 103 is configured to construct an evaluation model based on a deep learning network and to optimize the evaluation model using expert features from the financial model together with the feature engineering features.
In an alternative embodiment, model building module 103 includes:
A first model training unit 1031, which extracts deep and time-series features by using an LSTM network, establishes an evaluation model, and trains the established model on whether a default occurs by using the ADAM optimization algorithm and a mean square error function.
A second model training unit 1032, which performs model training on the static data by using a support vector machine.
A model result output unit 1033, configured to output a model result for the default training by using a decision-tree random forest method.
The data evaluation module 104 is configured to perform data evaluation based on the evaluation model and output a final approval list.
The expert feature processing module 105 is configured to introduce expert features, where the range of data required by the expert features is defined to have the minimum intersection with the range of data required in the feature engineering.
It should be noted that, for the execution process of the module or the unit in this embodiment, reference may be made to the detailed description in the foregoing method embodiment, and details are not described here again.
In the embodiment of the invention, expert features from the traditional financial model are combined with a deep learning method: feature engineering on the dynamic data mines deep features, such as the time-series features present in the credit transaction flow, and combining these with the existing expert model and features improves model accuracy. The lead time with which a default can be predicted is greatly extended and the prediction is more accurate, which greatly reduces credit risk as well as the tracking workload and working time of auditors. A lender can therefore extend credit to more customers, so that its loans are fully diversified and risk is controlled, which makes large-scale credit approval for small and micro enterprises, and efficient, intelligent and rapid lending and risk control, possible.
The embodiment of the present invention further provides a computer storage medium, which may store a plurality of instructions suitable for being loaded by a processor to execute the method steps in the embodiments shown in fig. 1 and fig. 2. For the specific execution process, reference may be made to the descriptions of the embodiments shown in fig. 1 and fig. 2, which are not repeated here.
The embodiment of the present application further provides a computer device. As shown in fig. 6, the computer device 20 may include at least one processor 201 (such as a CPU), at least one network interface 204, a user interface 203, a memory 205, at least one communication bus 202 and, optionally, a display 206. The communication bus 202 is used to implement connection and communication between these components. The user interface 203 may include a touch screen, a keyboard, a mouse, or the like. The network interface 204 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and a communication connection may be established with a server through the network interface 204. The memory 205 may be a high-speed RAM memory or a non-volatile memory (such as at least one disk memory); in the embodiment of the present invention the memory 205 includes a flash memory, and it may optionally also be at least one storage system located remotely from the aforementioned processor 201. As shown in fig. 6, the memory 205, as a computer storage medium, may include an operating system, a network communication module, and a user interface module.
It should be noted that the network interface 204 may be connected to a receiver, a transmitter or other communication module, and the other communication module may include, but is not limited to, a WiFi module, a bluetooth module, etc., and it is understood that the computer device in the embodiment of the present invention may also include a receiver, a transmitter, other communication module, etc.
Processor 201 may be used to call program instructions stored in memory 205 and cause computer device 20 to perform the following operations:
preprocessing the stored loan data, wherein the preprocessing comprises data dimension reduction, data reclassification, data merging and data cleaning, and the reclassified loan data comprises static data and dynamic data;
performing feature engineering on the preprocessed dynamic data to extract corresponding feature engineering features;
constructing an evaluation model based on a deep learning network, and training the evaluation model using expert features from a financial model together with the feature engineering features;
and evaluating the data based on the evaluation model and outputting a final approval list.
In some embodiments:
the data dimension reduction is to cluster every two to three dimensions of the loan data, perform binning and dummy variable processing after clustering, calculate and rank the information gain of the quantized dummy variables, and apply dummy variable processing to the loan data where the information gain exceeds a set threshold;
the data reclassification is to divide the loan data into static data and dynamic data;
the data merging is to join and merge the related table structures on the primary key that identifies the core customer, forming an original feature set;
and the data cleaning is to remove the counterparty transaction flow records from the dynamic data.
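As one possible illustration of the data merging step, the sketch below joins several tables on a customer primary key to form one feature record per customer. The table contents, the key name `customer_id` and the outer-join behavior (customers missing from a table simply lack its columns) are assumptions made for this sketch.

```python
def merge_on_customer(primary_key, *tables):
    """Join rows from several tables on the primary key into one feature record per customer."""
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row[primary_key], {}).update(row)
    return list(merged.values())

# Hypothetical tables keyed by the core customer identifier.
basic_info = [
    {"customer_id": "C001", "industry": "retail"},
    {"customer_id": "C002", "industry": "catering"},
]
loan_ledger = [
    {"customer_id": "C001", "overdue_loans": 0},
    {"customer_id": "C002", "overdue_loans": 3},
]
feature_set = merge_on_customer("customer_id", basic_info, loan_ledger)
```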
In one embodiment, when the device 20 performs feature engineering on the preprocessed loan data to extract corresponding feature engineering features, it is specifically configured to:
perform shallow feature extraction on the preprocessed loan data, performing segmented discretization on continuous features to obtain shallow features;
perform statistical feature extraction on the preprocessed loan data, extracting descriptive statistics of the time series within the data observation window to obtain statistical features, wherein the descriptive statistics comprise one or more of mean, standard deviation and quantile difference;
perform deep feature extraction on the preprocessed loan data to obtain corresponding deep features;
and combine the shallow features, the statistical features and the deep features to form the feature engineering features.
In one embodiment, the device 20 is further configured to:
introduce expert features from the financial model, and define the range of data required by the expert features to have the minimum intersection with the range of data required in the feature engineering.
In one embodiment, when the device 20 constructs an evaluation model based on the deep learning network and trains the evaluation model using expert features and feature engineering features, it is specifically configured to:
use an LSTM network to extract deep and time-series features, establish an evaluation model, and train the established model on whether a default occurs by using the ADAM optimization algorithm and a mean square error function;
perform model training on the static data by using a support vector machine;
and output a model result for the default training by using a decision-tree random forest method.
In the embodiment of the invention, expert features from the traditional financial model are combined with a deep learning method: feature engineering on the dynamic data mines deep features, such as the time-series features present in the credit transaction flow, and combining these with the existing expert model and features improves model accuracy. The lead time with which a default can be predicted is greatly extended and the prediction is more accurate, which greatly reduces credit risk as well as the tracking workload and working time of auditors. A lender can therefore extend credit to more customers, so that its loans are fully diversified and risk is controlled, which makes large-scale credit approval for small and micro enterprises, and efficient, intelligent and rapid lending and risk control, possible.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present invention and is not intended to limit the scope of its claims.

Claims (10)

1. A post-loan risk assessment method, comprising:
preprocessing the stored loan data, wherein the preprocessing comprises data dimension reduction, data reclassification, data combination and data cleaning, and the reclassified loan data comprises static data and dynamic data;
performing feature engineering on the preprocessed dynamic data to extract corresponding feature engineering features;
constructing an evaluation model based on a deep learning network, and training the evaluation model using expert features from a financial model and the feature engineering features;
and evaluating data based on the evaluation model and outputting a final approval list.
2. The method of claim 1, wherein:
the data dimension reduction is to perform clustering across every two to three dimensions of the loan data, perform binning and dummy variable processing after clustering, calculate and rank the information gain of each dummy variable, and apply dummy variable processing where the information gain exceeds a set threshold;
the data reclassification is to divide the loan data into static data and dynamic data;
the data combination is to join the related table structures on the primary key identifying the core client and merge them to form an original feature set;
and the data cleaning is to remove the data of the dynamic data.
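The information-gain step of the dimension reduction can be sketched as follows. This is a minimal Python example under stated assumptions: the clustering stage is omitted, the threshold is read as a filter that keeps only dummy variables whose gain exceeds it, and the function names, cut points and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())


def information_gain(dummy, labels):
    """Reduction in label entropy obtained by splitting on a 0/1 dummy variable."""
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for d, y in zip(dummy, labels) if d == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def select_dummies(values, labels, cut_points, threshold):
    """Bin a feature, expand the bins into dummies, rank them by gain, filter."""
    bins = [sum(v > c for c in cut_points) for v in values]
    ranked = []
    for b in range(len(cut_points) + 1):
        dummy = [1 if x == b else 0 for x in bins]
        ranked.append((information_gain(dummy, labels), b))
    ranked.sort(reverse=True)  # highest information gain first
    return [(b, g) for g, b in ranked if g > threshold]


# Hypothetical usage: one continuous feature, default labels, a single cut point.
kept = select_dummies([1, 2, 8, 9], [0, 0, 1, 1], cut_points=[5], threshold=0.5)
```

Each returned pair holds a bin index and its information gain, so low-value dummy variables can be dropped before the tables are combined.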
3. The method according to claim 1, wherein the performing feature engineering on the preprocessed loan data to extract corresponding feature engineering features comprises:
performing shallow feature extraction on the preprocessed loan data, and performing segmented discretization on continuous features to obtain shallow features;
extracting statistical features of the preprocessed loan data, and extracting descriptive statistics of a time series in a data observation window to obtain statistical features, wherein the descriptive statistics comprise one or more of mean, standard deviation and quantile difference;
performing deep feature extraction on the preprocessed loan data to obtain corresponding deep features;
and combining the shallow features, the statistical features and the deep features to form feature engineering features corresponding to feature engineering.
4. The method of claim 1, further comprising:
introducing expert features in the financial model, and defining the range of data required by the expert features to have minimum intersection with the range of data required in the feature engineering.
5. The method according to claim 3, wherein the building of the evaluation model based on the deep learning network and the training of the evaluation model by using expert features and feature engineering features comprise:
adopting an LSTM network to extract deep and time-series features and establish the evaluation model, and training the established model on whether a default occurs using the ADAM optimization algorithm and a mean square error function;
performing model training on the static data using a support vector machine;
and outputting a model result for the default training using a decision-tree random forest method.
6. A post-loan risk assessment apparatus, comprising:
the data preprocessing module is used for preprocessing the stored loan data, wherein the preprocessing comprises data dimension reduction, data reclassification, data combination and data cleaning, and the classified loan data comprises static data and dynamic data;
the feature engineering module is used for performing feature engineering on the preprocessed dynamic data to extract corresponding feature engineering features;
the model construction module is used for constructing an evaluation model based on a deep learning network and optimizing the evaluation model using expert features from a financial model and the feature engineering features;
and the data evaluation module is used for carrying out data evaluation based on the evaluation model and outputting a final approval list.
7. The apparatus of claim 6, wherein the feature engineering module comprises:
the shallow feature extraction unit is used for performing shallow feature extraction on the preprocessed loan data and discretizing continuous features in sections to obtain shallow features;
the statistical feature extraction unit is used for extracting statistical features of the preprocessed loan data and extracting descriptive statistics of a time series in a data observation window to obtain statistical features, wherein the descriptive statistics comprise one or more of mean, standard deviation and quantile difference;
the deep feature extraction unit is used for performing deep feature extraction on the preprocessed loan data to obtain corresponding deep features;
and the feature combining unit is used for combining the shallow features, the statistical features and the deep features to form the feature engineering features corresponding to the feature engineering.
8. The apparatus of claim 6, further comprising:
an expert feature processing module, used for introducing expert features and defining the range of data required by the expert features so that it has minimal intersection with the range of data required in the feature engineering.
9. The apparatus of claim 6, wherein the model building module comprises:
a first model training unit, which adopts an LSTM network to extract deep and time-series features, establishes the evaluation model, and trains the established model on whether a default occurs using the ADAM optimization algorithm and a mean square error function;
a second model training unit, used for performing model training on the static data using a support vector machine;
and a model result output unit, used for outputting a model result for the default training using a decision-tree random forest method.
10. A computer-readable storage medium having stored therein at least one instruction, at least one program segment, a set of codes, or a set of instructions, wherein the at least one instruction, the at least one program segment, the set of codes, or the set of instructions is loaded and executed by a processor to implement the post-loan risk assessment method as claimed in any one of claims 1 to 5.
CN201910983490.4A 2019-10-16 2019-10-16 Post-loan risk assessment method and device and storage medium Pending CN110738564A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910983490.4A CN110738564A (en) 2019-10-16 2019-10-16 Post-loan risk assessment method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910983490.4A CN110738564A (en) 2019-10-16 2019-10-16 Post-loan risk assessment method and device and storage medium

Publications (1)

Publication Number Publication Date
CN110738564A true CN110738564A (en) 2020-01-31

Family

ID=69270060

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910983490.4A Pending CN110738564A (en) 2019-10-16 2019-10-16 Post-loan risk assessment method and device and storage medium

Country Status (1)

Country Link
CN (1) CN110738564A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897918A (en) * 2017-02-24 2017-06-27 上海易贷网金融信息服务有限公司 A kind of hybrid machine learning credit scoring model construction method
CN107992982A (en) * 2017-12-28 2018-05-04 上海氪信信息技术有限公司 A kind of Default Probability Forecasting Methodology of the unstructured data based on deep learning
CN108154430A (en) * 2017-12-28 2018-06-12 上海氪信信息技术有限公司 A kind of credit scoring construction method based on machine learning and big data technology
CN108256691A (en) * 2018-02-08 2018-07-06 成都智宝大数据科技有限公司 Refund Probabilistic Prediction Model construction method and device
CN108564286A (en) * 2018-04-19 2018-09-21 天合泽泰(厦门)征信服务有限公司 A kind of artificial intelligence finance air control credit assessment method and system based on big data reference
US20180349986A1 (en) * 2017-06-05 2018-12-06 Mo Tecnologias, Llc System and method for issuing a loan to a consumer determined to be creditworthy and with bad debt forecast
CN109190808A (en) * 2018-08-15 2019-01-11 拍拍信数据服务(上海)有限公司 User's behavior prediction method, apparatus, equipment and medium
CN109360084A (en) * 2018-09-27 2019-02-19 平安科技(深圳)有限公司 Appraisal procedure and device, storage medium, the computer equipment of reference default risk

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353689A (en) * 2020-02-14 2020-06-30 北京贝壳时代网络科技有限公司 Risk assessment method and device
CN111353689B (en) * 2020-02-14 2023-10-31 北京贝壳时代网络科技有限公司 Risk assessment method and device
WO2021179907A1 (en) * 2020-03-11 2021-09-16 清华大学 Method and apparatus for generating risk assessment model and risk assessment method and apparatus
CN113393066A (en) * 2020-03-11 2021-09-14 清华大学 Method and device for generating risk assessment model and risk assessment method and device
CN111652710A (en) * 2020-06-03 2020-09-11 北京化工大学 Personal credit risk assessment method based on ensemble tree feature extraction and Logistic regression
CN111652710B (en) * 2020-06-03 2024-01-30 北京化工大学 Personal credit risk assessment method based on integrated tree feature extraction and Logistic regression
CN111754202A (en) * 2020-06-29 2020-10-09 深圳前海微众银行股份有限公司 Bill direct sticking method, device, equipment and computer readable storage medium
CN111754202B (en) * 2020-06-29 2024-05-28 深圳前海微众银行股份有限公司 Bill direct-pasting method, device, equipment and computer readable storage medium
WO2021159735A1 (en) * 2020-09-18 2021-08-19 平安科技(深圳)有限公司 Credit risk assessment method and apparatus, and computer device and storage medium
CN112116454A (en) * 2020-09-28 2020-12-22 中国建设银行股份有限公司 Credit evaluation method and device
CN112419045A (en) * 2020-11-25 2021-02-26 苏州大学 Unbalanced credit user classification method based on oversampling and random forest
CN112801709A (en) * 2021-02-05 2021-05-14 杭州拼便宜网络科技有限公司 User loss prediction method, device, equipment and storage medium
CN113052677A (en) * 2021-03-29 2021-06-29 北京顶象技术有限公司 Method and device for constructing two-stage loan prediction model based on machine learning
CN113052703A (en) * 2021-04-20 2021-06-29 中国工商银行股份有限公司 Transaction risk early warning method and device
CN112927071A (en) * 2021-04-21 2021-06-08 顶象科技有限公司 Post-loan behavior feature processing method and device
CN113421154A (en) * 2021-05-27 2021-09-21 上海交通大学 Credit risk assessment method and system based on control chart
CN113421154B (en) * 2021-05-27 2022-10-04 上海交通大学 Credit risk assessment method and system based on control chart
CN113642825A (en) * 2021-05-28 2021-11-12 浙江惠瀜网络科技有限公司 Supervision method suitable for vehicle loan cooperation mechanism
CN113450208A (en) * 2021-06-30 2021-09-28 中国建设银行股份有限公司 Loan risk change early warning and model training method and device
CN113888019A (en) * 2021-10-22 2022-01-04 山东大学 Personnel dynamic risk assessment method and system based on neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Xinyada technology building, 3888 Jiangnan Avenue, Binjiang District, Hangzhou City, Zhejiang Province 310000

Applicant after: Sinyada Technology Co.,Ltd.

Address before: Xinyada technology building, 3888 Jiangnan Avenue, Binjiang District, Hangzhou City, Zhejiang Province 310000

Applicant before: SUNYARD SYSTEM ENGINEERING Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20200131
