CN108648074B - Loan assessment method, device and equipment based on support vector machine - Google Patents

Publication number
CN108648074B
Authority
CN
China
Prior art keywords
loan
data
user
training
credit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810485676.2A
Other languages
Chinese (zh)
Other versions
CN108648074A (en)
Inventor
张雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Financial Technology Co Ltd Shanghai
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Financial Technology Co Ltd Shanghai
Priority to CN201810485676.2A
Publication of CN108648074A
Application granted
Publication of CN108648074B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Abstract

The application discloses a loan assessment method, apparatus and device based on a support vector machine, relating to the technical field of data processing, which can improve the efficiency and accuracy of loan assessment and raise the loan approval rate. The method comprises: collecting user data of different users in advance, wherein the user data comprises user attribute data, historical credit data and user investment data of the different users; processing the user data to obtain training data conforming to a credit scoring model training standard; training with a support vector machine algorithm based on the training data to obtain a credit scoring model; and, when a loan request of a loan user is received, analyzing with the credit scoring model to obtain the credit score and the loanable amount of the loan user. The application is applicable to loan assessment.

Description

Loan assessment method, device and equipment based on support vector machine
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a loan assessment method, apparatus and device based on a support vector machine.
Background
A credit record is a statement of an economic entity's credit standing, expressed in symbols or words. A credit rating agency produces it by gathering information from particular channels or from various parties in society, judging the credit status of the economic entity, and evaluating it against defined standards and indicators.
Currently, when financial institutions evaluate personal credit records for loan assessment, the judgment relies mainly on the personal experience of a loan officer. This is inefficient, and the many uncontrollable human factors impair the accuracy of the loan assessment, which restricts the development of financial institutions. Moreover, because loan officers are cautious in their review, the loan approval rate is too low to meet the loan demands of the broad user base, so loan users do not get a good experience.
Disclosure of Invention
In view of this, the present application provides a loan assessment method, apparatus and device based on a support vector machine. The main aim is to solve the problems of the existing approach, in which loan assessment relies mainly on a loan officer's personal experience in evaluating personal credit records: efficiency is low, the many uncontrollable human factors impair the accuracy of the loan assessment, and the loan approval rate is too low.
According to one aspect of the present application, there is provided a loan assessment method based on a support vector machine, the method comprising:
user data of different users are collected in advance, wherein the user data comprise user attribute data, historical credit data and user investment data of the different users;
processing the user data to obtain training data conforming to a credit scoring model training standard;
training by using a support vector machine algorithm based on the training data to obtain a credit scoring model;
and when a loan request of a loan user is received, analyzing and obtaining the credit score and the loan limit of the loan user by utilizing the credit score model.
According to another aspect of the present application, there is provided a loan assessment apparatus based on a support vector machine, the apparatus comprising:
the collecting unit is used for collecting user data of different users in advance, wherein the user data comprises user attribute data, historical credit data and user investment data of the different users;
the processing unit is used for processing the user data collected by the collecting unit to obtain training data which accords with the credit score model training standard;
the training unit is used for training by using a support vector machine algorithm based on the training data obtained by the processing unit to obtain a credit scoring model;
and the analysis unit is used for analyzing and obtaining the credit score and the loanable limit of the loan user by utilizing the credit score model obtained by the training unit when the loan request of the loan user is received.
According to still another aspect of the present application, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described loan assessment method based on a support vector machine.
According to still another aspect of the present application, there is provided an entity apparatus for loan assessment based on a support vector machine, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, the processor implementing the above-mentioned loan assessment method based on a support vector machine when executing the program.
By means of the above technical solution, and in contrast to the current practice of assessing loans mainly through a loan officer's personal experience in evaluating personal credit records, the loan assessment method, apparatus and device based on a support vector machine provided by the present application process the different user data collected in advance to obtain training data, and then train with a support vector machine algorithm based on the training data to obtain a credit scoring model. When a loan request of a loan user is received, the credit scoring model can be used to automatically analyze and obtain the credit score and the loanable amount of the loan user, without manual, experience-based assessment. This can improve the efficiency and accuracy of loan assessment and raise the loan approval rate, so that the loan demands of the broad user base can be met, loan users get a better experience, and the development of financial institutions is facilitated.
The foregoing is only an overview of the technical solution of the present application. So that the technical means of the present application can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features and advantages of the present application are more readily apparent, specific embodiments of the present application are set out below.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
Fig. 1 shows a flowchart of a loan assessment method based on a support vector machine according to an embodiment of the present application;
Fig. 2 is a schematic diagram of the overall architecture for implementing a loan assessment method based on a support vector machine according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a loan assessment device based on a support vector machine according to an embodiment of the present application.
Detailed Description
The present application will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other.
To address the problems that existing loan assessment relies mainly on a loan officer's personal experience, is inefficient, suffers in accuracy from the many uncontrollable human factors, and yields too low a loan approval rate, this embodiment of the present application provides a loan assessment method based on a support vector machine that can improve the efficiency and accuracy of loan assessment and raise the loan approval rate. As shown in fig. 1, the method comprises the following steps:
101. User data of different users is collected in advance.
The user data comprises user attribute data, historical credit data and user investment data of the different users. The user attribute data may include the user's name, gender, age, occupation, average monthly income and consumption level, property held in the user's name, residence, household, and the like; the historical credit data may include the user's historical credit scores, historical credit card consumption and repayment records, historical loan repayment status, anti-fraud records, and the like; the user investment data may include the user's wealth management records, investment projects, investment amounts and returns, and the like.
In this embodiment, the user data of different users may be collected by querying user records existing in financial institutions, insurance companies, banks, shopping websites, etc., or by means of a third party platform, a user online questionnaire, etc. The apparatus or device for performing the loan assessment may be configured on the loan assessment side for performing the loan assessment on the loan requester based on the collected user data.
102. The pre-collected user data is processed to obtain training data meeting the credit scoring model training standard.
The credit scoring model may be a model trained with a support vector machine algorithm.
In this embodiment, in order to train an accurate credit scoring model, the collected user data needs to be preprocessed, for example by deleting or repairing invalid data and by adjusting the format of the user data so that it can be read during model training, so that the resulting data conforms to the credit scoring model's standards for training content, format, and the like.
103. Based on the training data obtained by processing, training is carried out by using a support vector machine algorithm to obtain a credit scoring model.
The support vector machine (SVM) algorithm, like a neural network, is a learning mechanism; unlike a neural network, however, the SVM relies on mathematical methods and optimization techniques. In machine learning, a support vector machine (also called a support vector network) is a supervised learning model, together with its associated learning algorithms, that can analyze data and identify patterns, and is used for classification and regression analysis.
The support vector machine algorithm is built on the VC dimension theory of statistical learning theory and the structural risk minimization principle. Based on limited sample information, it seeks the best compromise between model complexity (i.e., the learning accuracy on the specific training samples) and learning ability (i.e., the ability to identify arbitrary samples without error), so as to obtain the best generalization ability. Therefore, in this embodiment, given that the user data collected in advance is small-sample, nonlinear and high-dimensional, a credit scoring model for loan assessment can be trained more effectively with the support vector machine algorithm.
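As an illustration of the trade-off between fitting the training samples and keeping the model simple, the following is a minimal sketch of a linear support vector machine trained by subgradient descent on the regularized hinge loss. It is not the kernel-based model of this embodiment; the function names, the toy data, the labels and the values of `lam`, `eta` and `epochs` are all assumptions made for illustration.

```python
def train_linear_svm(samples, labels, lam=0.01, eta=0.01, epochs=200):
    """Reduce lam/2 * ||w||^2 + hinge loss with per-sample subgradient steps."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization shrinks w on every step (keeps the model simple).
            w = [wi * (1.0 - eta * lam) for wi in w]
            if margin < 1.0:
                # The sample lies inside the margin: push the hyperplane away from it.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, centered two-feature data; +1 / -1 are hypothetical class labels.
X = [[-1.0, -0.8], [-0.7, -1.0], [1.0, 0.9], [0.8, 1.0]]
y = [-1, -1, 1, 1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

The parameter `lam` controls the compromise: a larger value favors a simpler model with a wider margin, a smaller value favors fitting the training samples.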
104. When a loan request of a loan user is received, the credit score model is utilized to analyze and obtain the credit score and the loan limit of the loan user.
The loan request carries user attribute data, historical credit data, user investment data and the like of a loan user, and the data are substituted into a credit scoring model to carry out loan assessment prediction, so that the loan user is classified, and credit scores and loanable amounts of the loan user are obtained. The overall architecture for implementing the method of this embodiment may be seen in fig. 2.
Furthermore, the loan request may also carry information on the loan user's intended use of the loan funds. Combining this with the analysis result obtained from the credit scoring model allows the loan user's credit rating to be assessed better, so that the credit limit can be adjusted proactively and in real time. For example, if the loan user intends to use the loan funds to purchase goods, the loan may be granted for the full loanable amount obtained from the analysis; if the loan user intends to use the loan funds to repay a credit card, which indicates that the loan user is short of funds and that a bad debt may arise, the loan may be granted for 80% of the loanable amount obtained from the analysis.
Compared with the existing approach, in which loan assessment relies mainly on a loan officer's personal experience in evaluating personal credit records, the loan assessment method based on a support vector machine provided by this embodiment can use the credit scoring model to automatically analyze and obtain the credit score and the loanable amount of the loan user, without manual, experience-based assessment. It can improve the efficiency and accuracy of loan assessment and raise the loan approval rate. It can be applied to scenarios such as users' everyday purchase payments and online loans, and gives users better payment capacity through real-time data analysis and flexible operational campaigns. The method can give loan users a better experience and benefits the development of financial institutions and e-commerce.
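The adjustment by use of funds can be sketched as a small lookup. Only the full-amount and 80% ratios come from the example above; the purpose labels, the function name and the fallback for unknown purposes are hypothetical.

```python
# Hypothetical purpose labels; the 1.0 and 0.8 ratios follow the example in the text.
PURPOSE_RATIO = {
    "purchase_goods": 1.0,      # lend the full analyzed loanable amount
    "repay_credit_card": 0.8,   # lend 80% of the analyzed loanable amount
}


def adjusted_limit(loanable_amount, purpose):
    """Scale the model's loanable amount by the declared use of the funds."""
    # Unknown purposes fall back to the full amount (an assumption of this sketch).
    ratio = PURPOSE_RATIO.get(purpose, 1.0)
    return loanable_amount * ratio


goods_limit = adjusted_limit(10000.0, "purchase_goods")
card_limit = adjusted_limit(10000.0, "repay_credit_card")
```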
Further, as a refinement and extension of the foregoing embodiment, and to illustrate the data processing procedure in step 102, in an optional manner step 102 may specifically include: imputing the missing values in the user data by multiple imputation; analyzing the imputed user data with a box plot, identifying the abnormal data in it and deleting that data; and applying logarithmic transformation and normalization to the user data remaining after the abnormal data is deleted, to obtain training data meeting the credit scoring model training standard.
Multiple imputation replaces each missing value with a vector of m imputed values, where m must be greater than or equal to a certain value. From these vectors, m complete data sets can be created: replacing each missing value with the first element of its vector creates the first complete data set, replacing each missing value with the second element of its vector creates the second complete data set, and so on. Each data set is then analyzed with a standard complete-data method.
A box plot, also known as a box-and-whisker plot, is a statistical chart that shows the dispersion of a set of data. It is drawn from commonly used statistics, provides key information about the location and dispersion of the data, and shows differences especially clearly when data from different populations are compared. The plot mainly comprises six data nodes: a set of data is sorted from largest to smallest, and its upper edge, upper quartile Q3, median, lower quartile Q1, lower edge and outliers are calculated. In this embodiment, box plot analysis removes abnormal data from the user data effectively and reduces its adverse effect on results derived from the user data.
After the abnormal data is deleted, logarithmic transformation and normalization can be applied to unify the statistical distribution of the user data and obtain training data meeting the credit scoring model training standard. Normalization limits the processed data to a certain range. Logarithmic transformation and normalization make the subsequent training easier and speed up convergence when the program runs.
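The vector-per-missing-value procedure described above (commonly called multiple imputation) can be sketched as follows. Drawing the imputed values at random from the observed values is a simplification assumed here; the text does not specify how the m values are generated. The function name, the `None` marker for missing entries and the sample figures are hypothetical.

```python
import random
from statistics import mean


def multiple_imputation(values, m=5, seed=0):
    """Return m completed copies of `values`, where None marks a missing entry."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    missing_idx = [i for i, v in enumerate(values) if v is None]
    # One vector of m candidate values per missing entry.
    vectors = {i: [rng.choice(observed) for _ in range(m)] for i in missing_idx}
    datasets = []
    for k in range(m):
        completed = list(values)
        for i in missing_idx:
            completed[i] = vectors[i][k]  # the k-th element fills the k-th data set
        datasets.append(completed)
    return datasets


incomes = [5200.0, None, 4800.0, 6100.0, None]
completed_sets = multiple_imputation(incomes, m=3)
# Analyze each completed data set with a standard method, then pool the results.
pooled_mean = mean(mean(d) for d in completed_sets)
```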
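The outlier-removal step can be sketched with the quartiles from which the box plot is built. The 1.5 × IQR whisker rule used here is the conventional choice and an assumption; the text does not state which rule defines the upper and lower edges. The function name and sample values are hypothetical.

```python
from statistics import quantiles


def remove_boxplot_outliers(data, whisker=1.5):
    """Drop values outside [Q1 - whisker*IQR, Q3 + whisker*IQR]."""
    q1, _median, q3 = quantiles(data, n=4)  # lower quartile, median, upper quartile
    iqr = q3 - q1
    lower_edge = q1 - whisker * iqr
    upper_edge = q3 + whisker * iqr
    return [x for x in data if lower_edge <= x <= upper_edge]


monthly_spend = [1200, 1350, 1280, 1420, 1310, 1390, 25000]
cleaned = remove_boxplot_outliers(monthly_spend)
```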
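A minimal sketch of the transformation step follows. The use of log(1 + x) and of min-max scaling to the range [0, 1] are assumptions; the text says only that the data is log-transformed and limited to a certain range.

```python
import math


def log_minmax(values):
    """Apply log(1 + x), then rescale the result to the range [0, 1]."""
    logged = [math.log1p(v) for v in values]  # log1p tolerates zero values
    lo, hi = min(logged), max(logged)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in logged]
    return [(x - lo) / (hi - lo) for x in logged]


scaled = log_minmax([0.0, 1500.0, 30000.0, 250000.0])
```

The logarithm compresses the long right tail typical of income and amount fields, so that a few very large values do not dominate the scaled range.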
Based on the training data obtained in step 102, and to illustrate how the credit scoring model is generated, in an optional manner step 103 may specifically include: constructing a plurality of training sets from the training data with the Bagging algorithm, wherein the training sets are obtained through multiple rounds of randomly drawing samples from the training data, each round yielding one training set; randomly selecting a predetermined number of feature samples, comprising positive samples and negative samples, from each training set for kernel function modeling; after a training set has been modeled, solving the kernel function with a pattern search algorithm to obtain an optimal hyperplane, wherein the optimal hyperplane separates the positive samples and negative samples randomly selected from the training set and maximizes the margin between the separated positive and negative sample vectors and the optimal hyperplane; generating a classifier for each training set from the function of the optimal hyperplane corresponding to that training set; randomly sampling the training data to obtain verification data, and using the verification data to calculate the correlation coefficients between each classifier and the other classifiers; and selecting the group of classifiers with the smallest correlation coefficients for integration to obtain the credit scoring model.
The Bagging algorithm is a method for improving the accuracy of learning algorithms: it constructs a series of prediction functions and then combines them in some way into a single prediction function. In this embodiment, the Bagging algorithm perturbs the initial sample data and uses bootstrap sampling to generate a plurality of different training sets; a fixed number of positive and negative samples are then randomly selected from each training set for kernel function modeling, so that an optimal hyperplane can be found through the kernel function. In this embodiment, positive samples may be defined with reference to data strongly related to the user's credit, and negative samples with reference to data weakly related to the user's credit.
The kernel function is solved with a pattern search algorithm to obtain the optimal hyperplane. Put simply, a pattern search algorithm searches for a sequence of points X0, X1, X2, …, each closer to the optimum than the last; when the termination condition is reached, the last point is taken as the solution of the search. In this embodiment, obtaining the optimal hyperplane with a pattern search algorithm shortens the construction time of the support-vector-machine-based credit scoring model while preserving its prediction accuracy.
Generalization ability characterizes the predictive ability of the credit scoring model. To improve it, this embodiment uses a support vector machine as the base learner and a pattern search algorithm to solve for the optimal kernel function parameters and obtain the optimal hyperplane. Applying pattern search ensures that the optimal parameters are found on each training set, which improves the generalization ability of each single classifier; classifiers trained on different training sets are not identical, which increases the diversity among individual classifiers. A correlation minimization algorithm then selects, from the individual classifiers, those that differ most from one another for integration, which further improves the generalization ability of the credit scoring model.
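The construction of the training sets can be sketched as sampling with replacement, followed by picking a fixed number of positive and negative samples from each set. The function names, the (features, label) sample layout, the +1 / -1 labels and the toy data are assumptions made for illustration.

```python
import random


def bootstrap_training_sets(samples, n_sets, rng):
    """Draw n_sets training sets, each sampled with replacement from samples."""
    n = len(samples)
    return [[samples[rng.randrange(n)] for _ in range(n)] for _ in range(n_sets)]


def pick_feature_samples(training_set, per_class, rng):
    """Randomly pick the same number of positive and negative samples."""
    positives = [s for s in training_set if s[1] == 1]
    negatives = [s for s in training_set if s[1] == -1]
    # A bootstrap set may hold fewer samples of one class than requested.
    k = min(per_class, len(positives), len(negatives))
    return rng.sample(positives, k) + rng.sample(negatives, k)


rng = random.Random(42)
# Each sample is (feature tuple, label); labels are +1 or -1.
data = [((i / 10.0,), 1 if i >= 5 else -1) for i in range(10)]
training_sets = bootstrap_training_sets(data, n_sets=3, rng=rng)
feature_samples = [pick_feature_samples(ts, per_class=2, rng=rng) for ts in training_sets]
```

Because each round samples with replacement, the training sets differ from one another, which is what makes the classifiers trained on them differ.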
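The point-by-point search can be sketched as a coordinate pattern search that tries a step in each direction, keeps any improvement, and halves the step when none is found. The objective below is a toy stand-in for a validation error over two kernel parameters; the parameter meanings, the function names and the step and tolerance values are hypothetical.

```python
def pattern_search(objective, start, step=1.0, tol=1e-3, max_iter=1000):
    """Minimize objective by probing +/- step along each coordinate."""
    point = list(start)
    best = objective(point)
    for _ in range(max_iter):
        if step <= tol:
            break  # termination condition: the step size is small enough
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = objective(trial)
                if value < best:
                    # Accept the better point and move on to the next coordinate.
                    point, best, improved = trial, value, True
                    break
        if not improved:
            step /= 2.0  # no better neighbor: refine the search pattern
    return point, best


def toy_error(params):
    """Hypothetical validation error with its minimum at (1.0, -2.0)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


best_point, best_value = pattern_search(toy_error, [0.0, 0.0])
```

In practice the objective would be the error of a classifier trained with the candidate parameters, which makes each evaluation expensive; a derivative-free search of this kind needs no gradient of that error.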
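The selection of weakly correlated classifiers can be sketched as follows: each classifier's outputs on the verification data are compared with those of every other classifier, and the classifiers with the lowest mean absolute Pearson correlation are kept. Using the mean absolute correlation as the ranking criterion is an assumption, as are the function names and the toy outputs.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant output carries no correlation information
    return cov / sqrt(var_a * var_b)


def select_diverse(outputs, k):
    """Keep the k classifiers least correlated, on average, with the others."""
    names = list(outputs)

    def mean_abs_corr(name):
        others = [o for o in names if o != name]
        total = sum(abs(pearson(outputs[name], outputs[o])) for o in others)
        return total / len(others)

    return sorted(names, key=mean_abs_corr)[:k]


# Hypothetical classifier outputs (+1 / -1) on six verification samples.
outputs = {
    "clf_a": [1, 1, -1, -1, 1, -1],
    "clf_b": [1, 1, -1, -1, 1, -1],   # identical to clf_a, so fully correlated
    "clf_c": [1, -1, 1, -1, 1, -1],
}
selected = select_diverse(outputs, k=2)
```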
To facilitate rapid loan assessment, the loan request may further carry identity information of the loan user, such as a name or ID number, that identifies the loan user. Accordingly, obtaining the credit assessment result of the loan user with the credit scoring model in step 104 may specifically include: using the identity information of the loan user to query whether user data of the loan user exists among the different user data collected in advance for training the credit scoring model; if so, substituting the existing user data of the loan user into the credit scoring model for classification and determining the type result of the loan user; and finally determining the credit score corresponding to the type result, and the loanable amount corresponding to the credit score segment to which that credit score belongs, as the credit score and the loanable amount of the loan user respectively, wherein different type results correspond to different credit scores and different credit score segments correspond to different loanable amounts.
In this way, the loan user can have the loan assessed with the credit scoring model without manually entering much loan request information, which is convenient for the loan user.
In this embodiment, for the credit scoring model to evaluate the credit score and loanable amount of the loan user accurately, the user data of the loan user should itself have taken part in training the credit scoring model; the model can then classify the loan user accurately when that user data is substituted into it for analysis, and the credit score and loanable amount obtained are accurate. It should be noted that, besides mapping credit score segments to loanable amounts, credit scores may also be mapped directly to loanable amounts; different credit scores may correspond to different loanable amounts (or to the same loanable amount), as chosen according to actual business requirements.
If no user data of the loan user exists among the different user data collected in advance for training the credit scoring model, the credit scoring model can make only a rough prediction. To improve the accuracy of loan assessment in this scenario, in an optional manner, a query request for the loan user's third-party credit score is sent to a third-party credit platform according to the identity information of the loan user, so that the third-party credit platform looks up the third-party credit score corresponding to the loan user according to the identity information; the third-party credit score corresponding to the loan user returned by the third-party credit platform is then received; the third-party credit score is converted to obtain a credit score conforming to the score format of the credit scoring model; and the converted credit score, and the loanable amount corresponding to the credit score segment to which it belongs, are determined as the credit score and the loanable amount of the loan user respectively.
In this embodiment, if the user data of the loan user did not take part in training the credit scoring model, a third-party credit score may be used for prediction in place of the credit scoring model, which can improve the accuracy of the loan assessment.
If the loan user has a default record, the loanable amount obtained from the credit scoring model is not the amount that can actually be lent. To fit specific usage scenarios and business requirements more closely, in an optional manner, the loanable amount corresponding to the credit score segment to which the credit score belongs is determined as the theoretical maximum loanable amount of the loan user, and the difference between this maximum loanable amount and the loan user's unrepaid amount is determined as the actual loanable amount of the loan user. The analysis result obtained in this way better meets business requirements and lets the loan user know the actual loanable amount.
Further, to meet the need to update the credit scoring model, the method of this embodiment may also include: when user data of a new user is collected and/or old user data is updated, processing the new user data and/or the updated old user data to obtain new training data conforming to the credit scoring model training standard; and training with a support vector machine algorithm based on the new training data and the old training data to obtain an updated credit scoring model. Correspondingly, step 104 may specifically include: when a loan request of a loan user is received, analyzing with the updated credit scoring model to obtain the credit score and the loanable amount of the loan user, so that the analysis result is more accurate and the need to update the credit scoring model is met.
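The two-stage mapping from type result to credit score, and from credit score segment to loanable amount, can be sketched with lookup tables. Every type label, score, segment boundary and amount below is hypothetical; the text specifies only that such correspondences exist.

```python
from bisect import bisect_right

# Hypothetical correspondences between type results and credit scores.
TYPE_TO_SCORE = {"A": 750, "B": 650, "C": 550}
# Hypothetical segments: [0, 600), [600, 700), [700, ...), one amount per segment.
SEGMENT_FLOORS = [0, 600, 700]
SEGMENT_AMOUNTS = [5000, 20000, 50000]


def assess(type_result):
    """Return (credit score, loanable amount) for a classifier type result."""
    score = TYPE_TO_SCORE[type_result]
    # Index of the last segment whose lower bound does not exceed the score.
    segment = bisect_right(SEGMENT_FLOORS, score) - 1
    return score, SEGMENT_AMOUNTS[segment]


result_a = assess("A")
result_b = assess("B")
result_c = assess("C")
```

Mapping scores directly to amounts, as the text also allows, would replace the segment lookup with a dictionary keyed by score.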
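The conversion of a third-party credit score into the score format of the credit scoring model can be sketched as a linear rescaling between two score ranges. The linear form and both ranges are assumptions; the text does not specify how the conversion is performed.

```python
def convert_score(third_party_score, src_range=(350, 950), dst_range=(300, 850)):
    """Linearly map a score from src_range onto dst_range (both hypothetical)."""
    src_lo, src_hi = src_range
    dst_lo, dst_hi = dst_range
    # Clamp out-of-range inputs so the result always falls within dst_range.
    clamped = min(max(third_party_score, src_lo), src_hi)
    fraction = (clamped - src_lo) / (src_hi - src_lo)
    return round(dst_lo + fraction * (dst_hi - dst_lo))


converted = convert_score(650)
```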
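The difference described above can be written as a one-line rule. Flooring the result at zero, for the case where the unrepaid amount exceeds the theoretical maximum, is an assumption of this sketch, as are the function and parameter names.

```python
def actual_loanable(theoretical_max, unrepaid):
    """Actual loanable amount: theoretical maximum minus the unrepaid amount."""
    # Never return a negative amount (an assumption; the text gives only the difference).
    return max(theoretical_max - unrepaid, 0.0)


remaining = actual_loanable(20000.0, 5000.0)
```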
Further, as a specific implementation of the methods of fig. 1 and fig. 2, an embodiment of the present application provides a loan assessment device based on a support vector machine, as shown in fig. 3, where the device includes: a collecting unit 21, a processing unit 22, a training unit 23, an analyzing unit 24.
A collecting unit 21, which may be configured to collect user data of different users in advance, wherein the user data includes user attribute data, historical credit data, and user investment data of different users; the collection unit 21 is a main functional module for collecting user data in the device, and can collect user data of different users by querying user records existing in financial institutions, insurance companies, banks, shopping websites and the like, or by means of a third party platform, a user online questionnaire and the like.
The processing unit 22 may be configured to process the user data collected by the collecting unit 21 to obtain training data that meets the training standard of the credit score model; the processing unit 22 is a functional module for preprocessing data in the device.
The training unit 23 may be configured to perform training by using a support vector machine algorithm to obtain a credit score model based on the training data obtained by the processing unit 22; the training unit 23 is a main functional module for training the credit scoring model based on the support vector machine algorithm, and is also a core module of the device.
The analysis unit 24 may be configured to, when receiving a loan request of a loan user, analyze and obtain a credit score and a loanable amount of the loan user using the credit score model obtained by the training unit 23. The analysis unit 24 is the main functional module for loan assessment in the present device.
In a specific application scenario, the processing unit 22 may be specifically configured to perform interpolation missing value processing on data with missing values in the user data by using a multiple interpolation method; analyzing the user data subjected to interpolation missing value processing by using a box diagram, identifying abnormal data in the box diagram and deleting the abnormal data; and carrying out logarithmic transformation and normalization processing on the user data after deleting the abnormal data to obtain training data meeting the training standard of the credit scoring model. For the embodiment, abnormal data in the user data can be well removed through box graph analysis, and adverse effects on the user data result are reduced. The logarithmic transformation and normalization processing is used for facilitating the subsequent data training, and secondly, the convergence is ensured to be quickened when the program runs.
In a specific application scenario, the training unit 23 may be specifically configured to construct a plurality of training sets by using a Bagging algorithm based on training data, where the training sets are obtained by training a plurality of samples randomly extracted from the training data, and each training round obtains a training set; randomly selecting a predetermined number of characteristic samples from each training set to perform kernel function modeling, wherein the characteristic samples comprise positive samples and negative samples; after the training set is modeled, solving a kernel function through a mode search algorithm to obtain an optimal hyperplane, wherein the optimal hyperplane can divide positive samples and negative samples randomly selected in the training set, and the interval between the positive samples and the negative samples after the division and the optimal hyperplane is the largest; generating a classifier corresponding to each training set based on the function of the optimal hyperplane corresponding to each training set; randomly sampling from the training data to obtain verification data, and calculating correlation coefficients between each classifier and other multiple classifiers by using the verification data; and selecting a group of classifiers with minimum correlation coefficients for integration to obtain a credit scoring model.
In this embodiment, a support vector machine is used as the base learner, and a pattern search algorithm is used to solve for the optimal parameters of the kernel function to obtain the optimal hyperplane. Applying pattern search ensures that optimal parameters are searched for on each training set, which improves the generalization ability of each individual classifier; because the classifiers trained on different training sets are not identical, the diversity among individual classifiers is enhanced. The correlation minimization algorithm then selects, from the plurality of individual classifiers, those that differ most from one another for integration, which further improves the generalization ability of the credit scoring model.
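As a rough illustration of how a pattern search can tune kernel parameters, the sketch below implements a basic coordinate (compass) pattern search: it probes a fixed step in each parameter direction, moves when the objective improves, and halves the step otherwise. The objective here is a hypothetical smooth stand-in for the validation error of a support vector machine as a function of (log C, log gamma); the parameter names, the starting point, and the location of the optimum are assumptions made for the example and are not taken from the application.

```python
def pattern_search(objective, start, step=1.0, min_step=1e-3, max_iter=1000):
    """Minimize objective by derivative-free coordinate pattern search."""
    point = list(start)
    best = objective(point)
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = objective(trial)
                if value < best:
                    # Accept the move and go on to the next coordinate.
                    point, best = trial, value
                    improved = True
                    break
        if not improved:
            # No direction helped at this scale, so refine the mesh.
            step /= 2
    return point, best


def toy_validation_error(params):
    """Hypothetical stand-in for an SVM's validation error.

    A real objective would train a classifier with the given kernel
    parameters on one training set and return its error on held-out data.
    """
    log_c, log_gamma = params
    return (log_c - 2.0) ** 2 + (log_gamma + 1.0) ** 2


best_params, best_error = pattern_search(toy_validation_error, [0.0, 0.0])
```

Because the search needs only objective values and no gradients, it can be run independently on each training set produced by Bagging.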
In a specific application scenario, optionally, the loan request carries identity information of the loan user, and the analysis unit 24 may specifically be configured to use the identity information to query whether user data of the loan user exists among the different user data collected in advance for training the credit score model; if so, input the existing user data of the loan user into the credit score model for classification, and determine the type result of the loan user; and determine the credit score corresponding to the type result, and the loanable amount corresponding to the credit score segment to which that credit score belongs, as the credit score and the loanable amount of the loan user, respectively, where different type results correspond to different credit scores, and different credit score segments correspond to different loanable amounts. In this way, the loan user can be assessed by the credit scoring model without manually entering extensive loan request information, which simplifies the operation for the loan user.
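The two lookups described above (type result to credit score, then credit score segment to loanable amount) can be sketched as follows. The type labels, scores, segment boundaries, and amounts are all hypothetical values chosen for illustration; the application does not specify them.

```python
from bisect import bisect_right

# Hypothetical mapping from classifier type result to credit score.
TYPE_TO_SCORE = {"good": 750, "fair": 600, "poor": 450}

# Hypothetical credit score segments: each lower bound opens a segment
# that runs up to the next lower bound.
SEGMENT_LOWER_BOUNDS = [0, 500, 650, 700]
SEGMENT_AMOUNTS = [0, 10000, 50000, 100000]


def loanable_amount(score):
    """Return the loanable amount of the segment that contains score."""
    index = bisect_right(SEGMENT_LOWER_BOUNDS, score) - 1
    if index < 0:
        raise ValueError("score is below the lowest segment")
    return SEGMENT_AMOUNTS[index]


def assess(type_result):
    """Map a type result to (credit score, loanable amount)."""
    score = TYPE_TO_SCORE[type_result]
    return score, loanable_amount(score)


good_result = assess("good")
fair_result = assess("fair")
poor_result = assess("poor")
```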
In a specific application scenario, the analysis unit 24 may be further specifically configured to: if no user data of the loan user exists among the different user data collected in advance for training the credit score model, send, according to the identity information, a query request for the third-party credit score of the loan user to a third-party credit platform; receive the third-party credit score corresponding to the loan user returned by the third-party credit platform; convert the third-party credit score to obtain a credit score conforming to the score format of the credit score model; and determine the credit score obtained through conversion, and the loanable amount corresponding to the credit score segment to which that credit score belongs, as the credit score and the loanable amount of the loan user, respectively.
For this embodiment, if the user data of the loan user did not participate in training the credit score model, the third-party credit score may be used in place of the prediction of the credit score model, which may improve the accuracy of the loan assessment.
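The conversion step is not specified further in the text. One simple possibility, shown here purely as an assumption, is a linear rescaling from the third-party platform's score range to the model's score range; both ranges and the function name are hypothetical.

```python
def convert_third_party_score(score, source=(350, 950), target=(300, 850)):
    """Linearly rescale a third-party score into the model's score format.

    source and target are hypothetical (min, max) ranges. Scores outside
    the source range are clamped before conversion.
    """
    source_low, source_high = source
    target_low, target_high = target
    clamped = min(max(score, source_low), source_high)
    fraction = (clamped - source_low) / (source_high - source_low)
    return round(target_low + fraction * (target_high - target_low))


converted = convert_third_party_score(650)
```

The converted score can then be passed through the same credit score segment lookup that is used for scores produced by the model.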
In a specific application scenario, the analysis unit 24 may be further configured to: if the loan user has a record of a loan that has not been repaid, determine the loanable amount corresponding to the credit score segment to which the credit score belongs as the theoretical maximum loanable amount of the loan user, and determine the difference between the maximum loanable amount and the outstanding unrepaid amount of the loan user as the actual loanable amount of the loan user. An analysis result obtained in this way better matches business requirements and lets the loan user know the actual loanable amount.
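As a worked example of this calculation, the sketch below subtracts the outstanding amount from the theoretical maximum. Flooring the result at zero when the outstanding amount exceeds the maximum is an assumption added here; the text only defines the difference. The function name and figures are hypothetical.

```python
def actual_loanable_amount(max_loanable, outstanding):
    """Theoretical maximum minus outstanding debt, floored at zero (assumed)."""
    return max(0, max_loanable - outstanding)


# A user in a 50,000 segment who still owes 12,000 can borrow 38,000.
remaining = actual_loanable_amount(50000, 12000)
```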
In a specific application scenario, in order to meet the update requirement of the credit score model, the processing unit 22 may be further configured to process, when user data of a new user is collected and/or update occurs on old user data, the user data of the new user and/or the updated old user data to obtain new training data that meets the training standard of the credit score model;
the training unit 23 is further configured to perform training by using a support vector machine algorithm based on the new training data and the old training data to obtain an updated credit score model;
accordingly, the analysis unit 24 may be further configured to, when receiving a loan request of a loan user, analyze and obtain a credit score and a loanable amount of the loan user using the updated credit score model.
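The update flow can be sketched as merging newly processed records into the existing training data and retraining on the combined set. Keying records by a user identifier, letting an updated record replace the stale one, and the placeholder training function are all assumptions made for the example.

```python
def merge_training_data(old_data, new_data):
    """Combine old and new training records keyed by user identifier.

    Assumption: an updated record for an existing user replaces the old one.
    """
    merged = dict(old_data)
    merged.update(new_data)
    return merged


def retrain(old_data, new_data, train_fn):
    """Retrain on the combined data; train_fn is a hypothetical trainer."""
    merged = merge_training_data(old_data, new_data)
    return train_fn(merged), merged


old_records = {"u1": [0.2, 0.7], "u2": [0.9, 0.1]}
new_records = {"u2": [0.8, 0.3], "u3": [0.5, 0.5]}   # u2 updated, u3 new

# Placeholder trainer that only reports how many records it was given.
model, combined = retrain(old_records, new_records, train_fn=len)
```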
It should be noted that, other corresponding descriptions of each functional unit related to the loan assessment device based on the support vector machine provided in the embodiments of the present application may refer to corresponding descriptions in fig. 1 to 2, and are not repeated here.
Based on the method shown in fig. 1, correspondingly, the embodiment of the application further provides a storage medium, on which a computer program is stored, and when the program is executed by a processor, the loan assessment method based on the support vector machine shown in fig. 1 is implemented.
Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing a computer device (may be a personal computer, a server, or a network device, etc.) to perform the methods described in various implementation scenarios of the present application.
Based on the method shown in fig. 1 and the virtual device embodiment shown in fig. 3, in order to achieve the above objective, the embodiment of the present application further provides an entity device for loan assessment based on a support vector machine, which may specifically be a terminal such as a personal computer, a server, a network device, etc., where the entity device includes a storage medium and a processor; a storage medium storing a computer program; and a processor for executing a computer program to implement the loan assessment method based on the support vector machine as shown in fig. 1.
Optionally, the entity device may further include a user interface, a network interface, a camera, Radio Frequency (RF) circuitry, sensors, audio circuitry, a Wi-Fi module, and the like. The user interface may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and optionally may also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (e.g., a Wi-Fi interface), and the like.
It will be appreciated by those skilled in the art that the structure of the entity device for loan assessment based on a support vector machine provided in this embodiment does not constitute a limitation on the entity device, which may include more or fewer components, combine some components, or arrange the components differently.
The storage medium may also include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the entity device for loan assessment based on a support vector machine, and supports the running of the information processing program and other software and/or programs. The network communication module is used to implement communication among the components inside the storage medium, as well as communication with other hardware and software in the entity device.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus a necessary general hardware platform, or may be implemented by hardware. Compared with the current approach, in which loan assessment relies mainly on a loan officer's personal experience in evaluating personal credit records, applying this technical solution allows the credit score model to automatically analyze and obtain the credit score and loanable amount of the loan user, so that loan assessment no longer needs to be performed manually based on experience. This can improve the efficiency and accuracy of loan assessment and raise the loan approval rate. The solution can be applied to users' everyday product purchase and payment scenarios, online loan scenarios, and the like, and provides users with better payment capacity through real-time data analysis and flexible operational activities. It can bring a better experience to loan users and benefits the development of financial institutions and e-commerce.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application. Those skilled in the art will also appreciate that the modules of an apparatus in an implementation scenario may be distributed in the apparatus as described for that implementation scenario, or may be changed accordingly and located in one or more apparatuses different from that implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The serial numbers used in the present application above are merely for description and do not represent the advantages or disadvantages of the implementation scenarios. The foregoing discloses merely a few specific implementation scenarios of the present application, but the present application is not limited thereto, and any variation that a person skilled in the art can conceive of shall fall within the protection scope of the present application.

Claims (9)

1. A loan assessment method based on a support vector machine, comprising:
user data of different users are collected in advance, wherein the user data comprise user attribute data, historical credit data and user investment data of the different users;
processing the user data to obtain training data conforming to a credit scoring model training standard;
training by using a support vector machine algorithm based on the training data to obtain a credit scoring model;
when a loan request of a loan user is received, analyzing and obtaining a credit score and a loanable amount of the loan user by utilizing the credit score model;
based on the training data, training by using a support vector machine algorithm to obtain a credit scoring model, which specifically comprises the following steps:
based on the training data, constructing a plurality of training sets by utilizing a Bagging algorithm, wherein each training set consists of a plurality of samples randomly drawn from the training data, and each round of sampling yields one training set;
randomly selecting a preset number of feature samples from each training set to perform kernel function modeling, wherein the feature samples comprise positive samples and negative samples;
after the training set is modeled, solving the kernel function through a pattern search algorithm to obtain an optimal hyperplane, wherein the optimal hyperplane separates the positive samples and negative samples randomly selected from the training set, and the margin between the separated positive and negative samples and the optimal hyperplane is maximal;
generating a classifier corresponding to each training set based on the function of the optimal hyperplane corresponding to each training set;
randomly sampling from the training data to obtain verification data, and calculating correlation coefficients between each classifier and other multiple classifiers by using the verification data;
and selecting a group of classifiers with minimum correlation coefficients for integration to obtain a credit scoring model.
2. The method according to claim 1, wherein processing the user data to obtain training data meeting credit score model training criteria comprises:
imputing missing values in the user data by using a multiple imputation method;
analyzing the imputed user data by using a box plot, identifying abnormal data in the box plot and deleting the abnormal data;
and carrying out logarithmic transformation and normalization processing on the user data after deleting the abnormal data to obtain training data meeting the training standard of the credit scoring model.
3. The method according to claim 1, wherein the loan request carries identity information of the loan user, and analyzing and obtaining the credit score and the loanable amount of the loan user by utilizing the credit score model specifically comprises:
inquiring whether user data of the loan user exists in different user data collected in advance for training the credit scoring model by using the identity information;
if so, substituting the existing user data of the loan user into the credit scoring model to classify, and determining the type result of the loan user;
and respectively determining the credit score corresponding to the type result and the loanable amount corresponding to the credit score section to which the credit score belongs as the credit score and the loanable amount of the loan user, wherein different types of results respectively correspond to different credit scores, and different credit score sections respectively correspond to different loanable amounts.
4. The method of claim 3, wherein after querying, by using the identity information, whether user data of the loan user exists among the different user data collected in advance for training the credit scoring model, the method further comprises:
if the user data does not exist, sending, according to the identity information, a query request for the third-party credit score of the loan user to a third-party credit platform;
receiving a third party credit score corresponding to the loan user returned by the third party credit platform;
converting the third-party credit score to obtain a credit score conforming to the credit score model score format;
and respectively determining the credit score obtained through conversion and the loanable amount corresponding to the credit score segment to which the credit score belongs as the credit score and the loanable amount of the loan user.
5. The method of claim 1, wherein if the loan user has a record of a loan that has not been repaid, determining the loanable amount corresponding to the credit score segment to which the credit score belongs as the loanable amount of the loan user specifically comprises:
determining the loanable amount corresponding to the credit score segment to which the credit score belongs as the theoretical maximum loanable amount of the loan user, and determining the difference between the maximum loanable amount and the outstanding unrepaid amount of the loan user as the actual loanable amount of the loan user.
6. The method according to any one of claims 1 to 5, further comprising:
when user data of a new user is collected and/or old user data is updated, processing the user data of the new user and/or the updated old user data to obtain new training data which accords with credit score model training standards;
training by using a support vector machine algorithm based on the new training data and the old training data to obtain an updated credit scoring model;
when receiving a loan request of a loan user, analyzing and obtaining a credit score and a loanable amount of the loan user by using the credit score model, wherein the method specifically comprises the following steps of:
and when a loan request of a loan user is received, analyzing and obtaining the credit score and the loan limit of the loan user by utilizing the updated credit score model.
7. A loan assessment device based on a support vector machine, comprising: the collecting unit is used for collecting user data of different users in advance, wherein the user data comprises user attribute data, historical credit data and user investment data of the different users;
the processing unit is used for processing the user data collected by the collecting unit to obtain training data which accords with the credit score model training standard;
the training unit is used for training by using a support vector machine algorithm based on the training data obtained by the processing unit to obtain a credit scoring model;
the analysis unit is used for analyzing and obtaining credit scores and loanable amounts of the loan users by utilizing the credit score model obtained by the training unit when the loan requests of the loan users are received;
the training unit is further used for constructing a plurality of training sets by utilizing a Bagging algorithm based on the training data, wherein each training set consists of a plurality of samples randomly drawn from the training data, and each round of sampling yields one training set; randomly selecting a preset number of feature samples from each training set to perform kernel function modeling, wherein the feature samples comprise positive samples and negative samples; after the training set is modeled, solving the kernel function through a pattern search algorithm to obtain an optimal hyperplane, wherein the optimal hyperplane separates the positive samples and negative samples randomly selected from the training set, and the margin between the separated positive and negative samples and the optimal hyperplane is maximal; generating a classifier corresponding to each training set based on the function of the optimal hyperplane corresponding to each training set; randomly sampling from the training data to obtain verification data, and calculating correlation coefficients between each classifier and the other classifiers by using the verification data; and selecting a group of classifiers with minimum correlation coefficients for integration to obtain a credit scoring model.
8. A storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the support vector machine-based loan assessment method of any of claims 1 to 6.
9. A support vector machine-based loan assessment apparatus comprising a storage medium, a processor and a computer program stored on the storage medium and executable on the processor, wherein the processor implements the support vector machine-based loan assessment method of any one of claims 1 to 6 when executing the program.
CN201810485676.2A 2018-05-18 2018-05-18 Loan assessment method, device and equipment based on support vector machine Active CN108648074B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810485676.2A CN108648074B (en) 2018-05-18 2018-05-18 Loan assessment method, device and equipment based on support vector machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810485676.2A CN108648074B (en) 2018-05-18 2018-05-18 Loan assessment method, device and equipment based on support vector machine

Publications (2)

Publication Number Publication Date
CN108648074A CN108648074A (en) 2018-10-12
CN108648074B true CN108648074B (en) 2023-06-09

Family

ID=63757180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810485676.2A Active CN108648074B (en) 2018-05-18 2018-05-18 Loan assessment method, device and equipment based on support vector machine

Country Status (1)

Country Link
CN (1) CN108648074B (en)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109584037A (en) * 2018-10-16 2019-04-05 深圳壹账通智能科技有限公司 Calculation method, device and the computer equipment that user credit of providing a loan scores
CN109636576A (en) * 2018-10-25 2019-04-16 深圳壹账通智能科技有限公司 Processing method, device, equipment and the storage medium of credit data
CN109816509A (en) * 2018-12-14 2019-05-28 平安科技(深圳)有限公司 Generation method, terminal device and the medium of scorecard model
CN109685649A (en) * 2018-12-28 2019-04-26 上海点融信息科技有限责任公司 The method, apparatus and storage medium of the accrediting amount are determined based on artificial intelligence
CN109886799A (en) * 2019-01-22 2019-06-14 上海上湖信息技术有限公司 A kind of real-time method and system predicted and show loaning bill success rate
CN111583010A (en) * 2019-02-18 2020-08-25 北京奇虎科技有限公司 Data processing method, device, equipment and storage medium
CN112930545A (en) * 2019-02-19 2021-06-08 算话智能科技有限公司 System and method for credit evaluation
CN110060144B (en) * 2019-03-18 2024-01-30 平安科技(深圳)有限公司 Method for training credit model, method, device, equipment and medium for evaluating credit
CN110135970A (en) * 2019-04-15 2019-08-16 深圳壹账通智能科技有限公司 Loan valuation method, apparatus, computer equipment and storage medium
CN112115258B (en) * 2019-06-20 2023-09-26 腾讯科技(深圳)有限公司 Credit evaluation method and device for user, server and storage medium
CN110443694A (en) * 2019-07-31 2019-11-12 中国工商银行股份有限公司 Financing method and device on little Wei enterprise line
CN111210337A (en) * 2019-12-27 2020-05-29 安徽科讯金服科技有限公司 User credit evaluation system for loan
CN111222979A (en) * 2019-12-27 2020-06-02 安徽科讯金服科技有限公司 Loan credit evaluation system based on government affair big data
CN111275545A (en) * 2020-02-14 2020-06-12 中国建设银行股份有限公司 Method, apparatus, device and medium for on-line mortgage
CN111340616B (en) * 2020-03-10 2024-03-19 中国建设银行股份有限公司 Method, device, equipment and medium for approving online loan
CN111583015A (en) * 2020-04-09 2020-08-25 上海淇毓信息科技有限公司 Credit application classification method and device and electronic equipment
TWI759759B (en) * 2020-06-09 2022-04-01 台北富邦商業銀行股份有限公司 Enterprise Loan Evaluation System
CN112163943A (en) * 2020-09-17 2021-01-01 中国建设银行股份有限公司 Method, device, equipment and medium for determining default probability
CN112163944A (en) * 2020-09-18 2021-01-01 中国建设银行股份有限公司 Loan qualification scoring method and device for customer, computer equipment and storage medium
CN112347343A (en) * 2020-09-25 2021-02-09 北京淇瑀信息科技有限公司 Customized information pushing method and device and electronic equipment
CN112700319A (en) * 2020-12-16 2021-04-23 中国建设银行股份有限公司 Enterprise credit line determination method and device based on government affair data
CN112950354A (en) * 2021-02-26 2021-06-11 中国光大银行股份有限公司 Credit scoring method and device for account, storage medium and electronic device
CN112907358A (en) * 2021-03-17 2021-06-04 平安消费金融有限公司 Loan user credit scoring method, loan user credit scoring device, computer equipment and storage medium
CN112862602A (en) * 2021-03-29 2021-05-28 中信银行股份有限公司 User request determining method, storage medium and electronic device
CN113129126B (en) * 2021-04-15 2023-04-25 算话智能科技有限公司 Service data processing method and device
CN113177837A (en) * 2021-05-12 2021-07-27 广州市全民钱包科技有限公司 Loan amount evaluation method, device, equipment and storage medium for loan applicant
CN113283979A (en) * 2021-05-12 2021-08-20 广州市全民钱包科技有限公司 Loan credit evaluation method and device for loan applicant and storage medium
CN113887981A (en) * 2021-10-14 2022-01-04 黑龙江省范式智能技术有限公司 Enterprise credit line standard analysis method
CN114387077A (en) * 2021-11-29 2022-04-22 阿尔法时刻科技(深圳)有限公司 Credit qualification examination method and system
CN116468265A (en) * 2023-03-23 2023-07-21 杭州瓴羊智能服务有限公司 Batch user data processing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107248114A (en) * 2017-06-01 2017-10-13 世纪禾光科技发展(北京)有限公司 Electric business loan administration method and system
CN107798600A (en) * 2017-12-05 2018-03-13 深圳信用宝金融服务有限公司 The credit risk recognition methods of the small micro- loan of internet finance and device
CN107967461A (en) * 2017-12-08 2018-04-27 深圳云天励飞技术有限公司 The training of SVM difference models and face verification method, apparatus, terminal and storage medium

Also Published As

Publication number Publication date
CN108648074A (en) 2018-10-12

Similar Documents

Publication Publication Date Title
CN108648074B (en) Loan assessment method, device and equipment based on support vector machine
CN108564286B (en) Artificial intelligent financial wind-control credit assessment method and system based on big data credit investigation
WO2020119272A1 (en) Risk identification model training method and apparatus, and server
CN108711107A (en) Intelligent financing services recommend method and its system
CN104321794A (en) A system and method using multi-dimensional rating to determine an entity's future commercial viability
Kostic et al. What image features boost housing market predictions?
WO2020135642A1 (en) Model training method and apparatus employing generative adversarial network
CN113469730A (en) Customer repurchase prediction method and device based on RF-LightGBM fusion model under non-contract scene
CN111626767B (en) Resource data issuing method, device and equipment
CN112449002B (en) Method, device and equipment for pushing object to be pushed and storage medium
CN116739794B (en) User personalized scheme recommendation method and system based on big data and machine learning
CN117132383A (en) Credit data processing method, device, equipment and readable storage medium
CN113011966A (en) Credit scoring method and device based on deep learning
CN109978300B (en) Customer risk tolerance quantification method and system, and asset configuration method and system
CN115689708A (en) Screening method, risk assessment method, device, equipment and medium of training data
Koç et al. Consumer loans' first payment default detection: a predictive model
CN114626940A (en) Data analysis method and device and electronic equipment
Dixon et al. A Bayesian approach to ranking private companies based on predictive indicators
Guo et al. Explainable recommendation systems by generalized additive models with manifest and latent interactions
Zimal et al. Customer churn prediction using machine learning
KR101656024B1 (en) Matching apparatus and method for mate candidate
CN116384749A (en) Method for training risk rating prediction model and computing equipment
CN116385151A (en) Method and computing device for risk rating prediction based on big data
Aljifri Predicting Customer Churn in a Subscription-Based E-Commerce Platform Using Machine Learning Techniques
CN116384750A (en) Method and computing device for generating marking sample and training risk rating prediction model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant