CN112348685A - Credit scoring method, device, equipment and storage medium - Google Patents

Publication number: CN112348685A
Authority: CN (China)
Prior art keywords: preset, credit, data, target application, model
Legal status: Pending
Application number: CN202011081209.7A
Other languages: Chinese (zh)
Inventors: 刘新儒, 刘圣军, 任燕
Current Assignee: Central South University
Original Assignee: Central South University
Application filed by Central South University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention relates to the technical field of credit evaluation and discloses a credit scoring method, device, equipment and storage medium for improving the accuracy with which a financial institution evaluates a client's credit. The method comprises the following steps: acquiring target application client data; screening characteristic variables of the target application client data according to preset characteristic variables; processing the screened characteristic variables according to preset variable standardization parameters in a data standardization processing mode to obtain corresponding standardized data; inputting the standardized data into a preset expert group decision model for prediction, and outputting the corresponding conservation probability of the target application client; and inputting the conservation probability and the historical credit record of the target application client into a preset credit scoring model, and outputting the credit score of the target application client.

Description

Credit scoring method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of credit evaluation, in particular to a credit scoring method, a credit scoring device, credit scoring equipment and a credit scoring storage medium.
Background
A credit score is obtained by applying a credit scoring model to the credit history data of a client. Based on the client's credit score, the credit provider can assess the likelihood that the client will pay on time and can accordingly decide whether to grant credit, and at what amount and interest rate. Although the credit provider could reach a similar conclusion by analyzing the client's credit history data directly, credit scoring is faster, more objective and more consistent. As the basis and core of credit risk management, the credit scoring model plays an irreplaceable role in establishing a social credit reporting system and in managing the credit assets of financial institutions.
At present, the ability to quantify and analyze a client's credit risk is lacking. Existing credit scoring models make the credit evaluation of a new client complicated and score credit with low accuracy, so they struggle to meet the needs of financial institutions when evaluating the credit of an application client.
Disclosure of Invention
The invention mainly aims to realize automatic scoring of personal credit so as to improve the accuracy of a financial institution in evaluating the credit of a client.
To achieve the above object, a first aspect of the present invention provides a credit scoring method, including:
acquiring target application client data;
screening characteristic variables of the target application client data according to preset characteristic variables;
processing the screened characteristic variables according to preset variable standardization parameters in a data standardization processing mode to obtain corresponding standardized data;
inputting the standardized data into a preset expert group decision model for prediction, and outputting the corresponding conservation probability of the target application client;
and inputting the conservation probability and the historical credit record of the target application client into a preset credit scoring model, and outputting the credit score of the target application client.
Optionally, in another implementation manner of the first aspect of the present invention, the credit scoring method further includes:
and inputting the credit score of the target application client into a preset client credit grade standard model, and outputting the credit granting grade of the target application client.
Optionally, in another implementation manner of the first aspect of the present invention, the credit scoring method further includes:
and inputting the credit granting grade of the target application client into a preset grade standard execution interest rate level model, and outputting the execution interest rate level of the target application client.
Optionally, in another implementation manner of the first aspect of the present invention, before the obtaining the target application client data, the method further includes:
sampling and acquiring client data of known credit categories from a preset database to form an original data sample;
processing variables in the original data samples according to a data standardization processing mode to generate corresponding standardized samples, and outputting and recording the standardization parameters of each variable, wherein the recorded standardization parameters of each variable are the preset variable standardization parameters;
extracting characteristic variables of the standardized samples according to a preset extraction model to generate corresponding training samples, and outputting and recording the selected characteristic variables, wherein the selected characteristic variables are the preset characteristic variables;
and inputting the training samples into a preset probabilistic neural network model for model training to obtain the preset expert group decision model.
Optionally, in another implementation manner of the first aspect of the present invention, the data standardization processing manner includes dummy-variable encoding of attribute data and non-dimensionalization of numerical data;
the preset extraction model is a partial least squares regression model.
Optionally, in another implementation manner of the first aspect of the present invention, the inputting the training sample into a preset probabilistic neural network model for model training specifically includes:
using a Bootstrap algorithm to repeatedly sample the training samples to form a plurality of corresponding calculation samples;
and inputting each calculation sample into a preset probabilistic neural network model for model training to obtain a plurality of corresponding target probabilistic neural network models, wherein the preset expert group decision model is the set of the target probabilistic neural network models.
Optionally, in another implementation manner of the first aspect of the present invention, the inputting the standardized data into a preset expert group decision model for prediction, and outputting the conservation probability corresponding to the target application client specifically includes:
inputting the standardized data into each target probabilistic neural network model, and outputting a plurality of corresponding target conservation probabilities;
and solving a weighted average of all the target conservation probabilities to obtain the corresponding conservation probability of the target application client.
The second aspect of the present invention provides a credit scoring apparatus, including:
the target client data acquisition module is used for acquiring target application client data;
the characteristic variable screening module is used for screening the characteristic variables of the target application client data according to preset characteristic variables;
the standardized data acquisition module is used for processing the screened characteristic variables according to preset variable standardized parameters in a data standardized processing mode to obtain corresponding standardized data;
the conservation probability prediction module is used for inputting the standardized data into a preset expert group decision model for prediction and outputting the conservation probability corresponding to the target application client;
and the credit score output module is used for inputting the conservation probability and the historical credit record of the target application client into a preset credit score model and outputting the credit score of the target application client.
Optionally, in another implementation manner of the second aspect of the present invention, the credit scoring apparatus further includes:
and the credit granting grade output module is used for inputting the credit score of the target application client into a preset client credit grade standard model and outputting the credit granting grade of the target application client.
Optionally, in another implementation manner of the second aspect of the present invention, the credit scoring apparatus further includes:
and the execution interest rate level output module is used for inputting the credit rating of the target application client into a preset rating standard execution interest rate level model and outputting the execution interest rate level of the target application client.
Optionally, in another implementation manner of the second aspect of the present invention, the apparatus further includes:
the original data sampling module is used for sampling and acquiring client data of known credit types from a preset database to form an original data sample;
the preset standardization parameter acquisition module is used for processing variables in the original data sample according to a data standardization processing mode to generate corresponding standardized samples, and for outputting and recording the standardization parameters of each variable, wherein the recorded standardization parameters of each variable are the preset variable standardization parameters;
the preset characteristic variable acquisition module is used for extracting the characteristic variables of the standardized samples according to a preset extraction model to generate corresponding training samples, and for outputting and recording the selected characteristic variables, wherein the selected characteristic variables are the preset characteristic variables;
and the model training module is used for inputting the training samples into a preset probabilistic neural network model for model training to obtain the preset expert group decision model.
Optionally, in another implementation manner of the second aspect of the present invention, the data standardization processing manner includes dummy-variable encoding of attribute data and non-dimensionalization of numerical data;
the preset extraction model is a partial least squares regression model.
Optionally, in another implementation manner of the second aspect of the present invention, the model training module specifically includes:
the resampling unit is used for resampling the training samples by using a Bootstrap algorithm to form a plurality of corresponding calculation samples;
and the target probabilistic neural network model acquisition unit is used for inputting each calculation sample into a preset probabilistic neural network model for model training to obtain a plurality of corresponding target probabilistic neural network models, wherein the preset expert group decision model is the set of the target probabilistic neural network models.
Optionally, in another implementation manner of the second aspect of the present invention, the conservation probability prediction module specifically includes:
a target conservation probability obtaining unit, configured to input the standardized data into each target probabilistic neural network model, and output a plurality of corresponding target conservation probabilities;
and the weighted average processing unit is used for solving a weighted average value of all the target conservation probabilities to obtain the conservation probability corresponding to the target application client.
A third aspect of the present invention provides a credit scoring apparatus comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line; the at least one processor invokes the instructions in the memory to cause the credit scoring apparatus to perform the method of the first aspect.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to perform the method of the first aspect described above.
According to the technical scheme provided by the invention, target application client data is obtained; characteristic variables of the target application client data are screened according to preset characteristic variables; the screened characteristic variables are processed according to preset variable standardization parameters in a data standardization processing mode to obtain corresponding standardized data; the standardized data is input into a preset expert group decision model for prediction, and the corresponding conservation probability of the target application client is output; and the conservation probability and the historical credit record of the target application client are input into a preset credit scoring model, and the credit score of the target application client is output. The invention addresses the problems that existing credit scoring models make the credit evaluation of new clients complicated and score credit with low accuracy; it realizes automatic scoring of personal credit and improves the accuracy with which a financial institution evaluates client credit.
Drawings
FIG. 1 is a diagram illustrating an embodiment of a credit scoring method according to an embodiment of the present invention;
FIG. 2 is a diagram of another embodiment of a credit scoring method according to an embodiment of the present invention;
FIG. 3 is a diagram of an embodiment of a credit scoring device according to an embodiment of the invention;
FIG. 4 is a schematic diagram of another embodiment of a credit scoring device according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of a credit scoring device in an embodiment of the invention.
Detailed Description
The embodiment of the invention provides a credit scoring method, a device, equipment and a storage medium, which are used for automatically scoring personal credit and improving the accuracy of a financial institution in evaluating the credit of a client.
In order that those skilled in the art may better understand the scheme of the invention, the embodiments of the invention are described below with reference to the accompanying drawings.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
To aid understanding, the detailed flow of an embodiment of the present invention is described below. Referring to FIG. 1, an embodiment of the credit scoring method according to an embodiment of the present invention includes:
101. Acquiring target application client data.
Specifically, the server acquires client data of a target application client to be credit scored, and the client data which can be acquired for an individual client comprises income, assets, age, occupation, residence, historical credit records and the like. Before credit scoring is carried out on a new application client, corresponding client data needs to be acquired, and subsequent data processing is facilitated.
102. Screening the characteristic variables of the target application client data according to preset characteristic variables.
Specifically, the server screens the characteristic variables of the target application client data according to the preset characteristic variables. As step 101 shows, the target application client data contains many variables. Screening reduces the dimension of the original data space, saves computation in credit scoring, and eliminates factors that interfere with the prediction of the credit scoring model.
103. Processing the screened characteristic variables according to the preset variable standardization parameters in a data standardization processing mode to obtain corresponding standardized data.
Specifically, the server processes the characteristic variables screened in step 102 in the data standardization processing manner, using the preset variable standardization parameters recorded in the model training phase, to obtain the corresponding standardized data. In a specific embodiment, the data standardization processing manner of the present invention includes dummy-variable encoding of attribute data and non-dimensionalization of numerical data. The target application client data contains both attribute variables and numerical variables, such as occupation and income, so the screened characteristic variables may include attribute variables (including dependent variables) as well as numerical variables. Attribute data among the screened characteristic variables must be encoded as dummy variables. To avoid the "dummy variable trap", the number of dummy variables used for each attribute variable in the present invention is one less than the number of categories of that variable.
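The dummy-variable rule described above (one fewer dummy than the number of categories) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the choice of the alphabetically first category as the dropped baseline, and the occupation values are all assumptions.

```python
def fit_dummy_levels(values):
    """Return the sorted category levels; the first level is the dropped baseline."""
    return sorted(set(values))


def dummy_encode(value, levels):
    """Encode one attribute value as len(levels) - 1 dummy variables."""
    if value not in levels:
        raise ValueError(f"unknown category: {value!r}")
    # The baseline level (levels[0]) is represented by all zeros,
    # which is what avoids the dummy variable trap.
    return [1 if value == level else 0 for level in levels[1:]]


# Hypothetical occupation data with three categories -> two dummy variables.
occupations = ["teacher", "engineer", "farmer", "engineer"]
levels = fit_dummy_levels(occupations)
encoded = [dummy_encode(v, levels) for v in occupations]
```

Recording `levels` at training time and reusing it for a new applicant keeps the encoding of the target application client data consistent with the training samples.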
Furthermore, numerical variables are made dimensionless: abnormal data are removed first, and the numerical variables are then normalized. Non-dimensionalization prevents the instability in model training and testing that inconsistent dimensions (units) would otherwise cause. Taking a variable X as an example, it is normalized to X′ as follows:
[Formula image RE-GDA0002886589610000061 not reproduced in the source text: normalization of X to X′]
After the variables are standardized in this way, both the classification advantages of the attribute variables and the weight advantages of the numerical variables can be fully exploited.
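The idea of recording standardization parameters during training and reusing them for a new applicant can be sketched as below. The normalization formula itself appears only as a figure that is not reproduced in the text, so this sketch assumes z-score standardization (subtract the mean, divide by the standard deviation); the function names and income values are likewise hypothetical.

```python
import statistics


def fit_standardizer(values):
    """Record the standardization parameters (mean, std) of one numerical variable."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Guard against a constant variable, which would otherwise divide by zero.
    return {"mean": mean, "std": std if std > 0 else 1.0}


def standardize(value, params):
    """Apply previously recorded parameters to a single value (assumed z-score form)."""
    return (value - params["mean"]) / params["std"]


# Parameters are fitted on historical (training) data only ...
train_income = [3000.0, 5000.0, 7000.0]
params = fit_standardizer(train_income)

# ... and then reused unchanged for the target application client.
new_client_income = standardize(9000.0, params)
```

Fitting on the training data and only applying the stored parameters at scoring time mirrors the role of the preset variable standardization parameters in the text.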
104. Inputting the standardized data into a preset expert group decision model for prediction, and outputting the corresponding conservation probability of the target application client.
Specifically, the server inputs the obtained standardized data into the preset expert group decision model for prediction and outputs the conservation probability of the target application client. The preset expert group decision model obtained in the model training phase is a set of several target probabilistic neural network models. Consequently, predicting the conservation probability of the target application client yields several target conservation probabilities, which must be combined by weighted averaging to obtain the client's conservation probability. In a specific implementation, the standardized data is input into each target probabilistic neural network model, which outputs a corresponding target conservation probability, and the weighted average of all the target conservation probabilities is taken as the conservation probability of the target application client. Weighted averaging makes the predicted conservation probability of the target client more accurate and improves the accuracy of the model's predictions.
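The weighted-average step can be sketched as follows. The source does not say how the weights are chosen, so equal weights are assumed by default here and the explicit weights in the example are purely illustrative; the function name is hypothetical.

```python
def ensemble_probability(probabilities, weights=None):
    """Weighted average of the per-expert target conservation probabilities."""
    if not probabilities:
        raise ValueError("at least one expert output is required")
    if weights is None:
        # Assumption: with no weighting scheme given, every expert counts equally.
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("one weight per expert is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probabilities, weights)) / total


# Hypothetical outputs of three experts for one applicant.
expert_outputs = [0.9, 0.8, 0.7]
p_equal = ensemble_probability(expert_outputs)
p_weighted = ensemble_probability(expert_outputs, weights=[2.0, 1.0, 1.0])
```

Dividing by the sum of the weights keeps the result a valid probability even when the weights are not normalized in advance.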
105. Inputting the conservation probability and the historical credit record of the target application client into a preset credit scoring model, and outputting the credit score of the target application client.
Specifically, the server inputs the conservation probability and the historical credit record of the target application client into the preset credit scoring model and outputs the credit score of the target application client. The preset credit scoring model can be designed according to the actual situation, with the conservation probability and the historical credit record as input and the credit score as output; its form is not limited herein.
It can be seen that credit scoring is a risk quantification model: it uses the observable characteristic variables of the borrower, that is, the target application client, to calculate a value, the credit score, that represents the borrower's credit risk, and it can also classify borrowers into different risk classes and indicate the probability of default.
Further, in another embodiment of the present invention, the credit scoring method further includes:
and inputting the credit score of the target application client into a preset client credit grade standard model, and outputting the credit granting grade of the target application client.
When the method is specifically implemented, the server inputs the credit score of the target application client into a preset client credit level standard model, and can output the credit rating of the target application client. The preset customer credit rating standard model may divide the credit score into a plurality of different ranges according to actual conditions, and set different credit ratings correspondingly, that is, the credit score is used as an input value, and the credit rating is used as an output value to perform model design, which is not limited herein.
Further, in another embodiment of the present invention, the credit scoring method further includes:
and inputting the credit granting grade of the target application client into a preset grade standard execution interest rate level model, and outputting the execution interest rate level of the target application client.
In specific implementation, the server can also input the credit granting grade of the target application client into a preset grade standard execution interest rate level model, so that the execution interest rate level of the target application client is output and obtained. The level standard execution interest rate level model may set different execution interest rate levels corresponding to different credit granting levels according to actual situations, that is, model design is performed with the credit granting level as an input value and the execution interest rate level as an output value, which is not limited herein.
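The two mappings described above (credit score to credit granting grade, and grade to execution interest rate level) can be sketched as simple lookups. The source leaves the score ranges, grade labels and rate levels to the designer, so every threshold and label below is a hypothetical placeholder.

```python
from bisect import bisect_right

# Hypothetical score bands: scores below 500 -> "D", 500-599 -> "C",
# 600-699 -> "B", 700 and above -> "A".
GRADE_LOWER_BOUNDS = [500, 600, 700]
GRADES = ["D", "C", "B", "A"]

# Hypothetical execution interest rate levels, one per grade.
RATE_LEVEL_BY_GRADE = {"A": 1, "B": 2, "C": 3, "D": 4}


def credit_granting_grade(score):
    """Map a credit score to a grade by locating its score band."""
    return GRADES[bisect_right(GRADE_LOWER_BOUNDS, score)]


def execution_rate_level(grade):
    """Map a credit granting grade to an execution interest rate level."""
    return RATE_LEVEL_BY_GRADE[grade]


grade = credit_granting_grade(650)
level = execution_rate_level(grade)
```

Keeping the bands and the rate table as data rather than code reflects the statement that both models are set according to actual conditions.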
Therefore, this embodiment of the credit scoring method can directly output the credit score, the credit granting grade and the corresponding execution interest rate level of the target application client, so that the financial institution can see the client's credit condition at a glance, which improves the accuracy with which the financial institution evaluates client credit.
Further, in another embodiment of the present invention, a model training phase precedes step 101. Referring to FIG. 2, the credit scoring method further includes:
201. Sampling client data of known credit categories from a preset database to form an original data sample.
Specifically, in the model training phase, the server samples client data of known credit categories from a preset database in advance to form an original data sample. In a specific implementation of the present invention, the data in the preset database includes, but is not limited to, plain text files, Excel files, SAS datasets and related databases and shared resources, SPSS datasets and related databases and shared resources, Matlab data files, and various custom data files. The client data is historical data in which each client's credit score, credit granting grade and execution interest rate level are known.
202. Processing the variables in the original data sample according to a data standardization processing mode to generate a corresponding standardized sample, and outputting and recording the standardization parameters of each variable, wherein the recorded standardization parameters are the preset variable standardization parameters.
Specifically, the server further processes the variables in the sampled original data sample, where the variables are the variables in the client data, according to a data normalization processing manner to generate a corresponding normalized sample, and in a specific implementation, the data normalization processing manner is the same as that in step 103, and is not described herein again. And outputting and correspondingly recording the standardized parameter of each variable in the standardization processing process, wherein the obtained standardized parameter of each variable is the preset variable standardized parameter.
203. Extracting characteristic variables of the standardized samples according to a preset extraction model to generate corresponding training samples, and outputting and recording the selected characteristic variables, wherein the selected characteristic variables are the preset characteristic variables.
Specifically, the server performs feature variable extraction on the obtained standardized sample according to a preset extraction model, generates a corresponding training sample, outputs and records the selected feature variable, where the obtained selected feature variable is the preset feature variable in step 102.
In the model training process, the collected historical client credit information samples form a data set of high-dimensional characteristic variables. These variables include factors that contribute little to credit evaluation or even harm it, and the complexity of the high-dimensional problem leads to the curse of dimensionality. Characteristic variable extraction uses a specific extraction method to reduce the dimension of the original feature space, which saves computation in credit evaluation and eliminates factors that interfere with it, thereby improving the efficiency and accuracy of the client credit evaluation method.
Further, in a specific implementation of the present invention, the preset extraction model is a partial least squares regression model. A partial least squares regression model separates system information from noise more easily, including some non-random noise, and the regression coefficient of each independent variable is easier to interpret. The present invention therefore extracts characteristic variables with a partial least squares regression model, using forward selection.
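The source names partial least squares regression with forward selection but gives no procedure, so the following is only a simplified sketch of one way PLS can rank variables: it computes the first-component PLS1 weight vector (proportional to Xᵀy on standardized data) and keeps the k variables with the largest absolute weights. The function names, the top-k rule, and the toy data with its column names are assumptions, not the patent's method.

```python
import math


def standardize_column(column):
    """Center a column and scale it to unit (population) standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    if std == 0:
        return [0.0] * len(column)
    return [(v - mean) / std for v in column]


def first_pls_weights(rows, y):
    """First-component PLS1 weights: w = X^T y / ||X^T y|| on standardized data."""
    n_features = len(rows[0])
    columns = [standardize_column([row[j] for row in rows]) for j in range(n_features)]
    y_std = standardize_column(y)
    raw = [sum(x * t for x, t in zip(col, y_std)) for col in columns]
    norm = math.sqrt(sum(r * r for r in raw))
    if norm == 0:
        raise ValueError("no variable covaries with the response")
    return [r / norm for r in raw]


def select_features(rows, y, names, k):
    """Keep the k variables with the largest absolute first-component weight."""
    weights = first_pls_weights(rows, y)
    ranked = sorted(zip(names, weights), key=lambda item: abs(item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy data: the third column is uncorrelated with the credit category.
rows = [[1.0, 5.0, 0.2], [2.0, 3.0, 0.1], [3.0, 4.0, 0.1], [4.0, 1.0, 0.2]]
labels = [0, 0, 1, 1]
selected = select_features(rows, labels, ["income", "age", "noise"], k=2)
```

A fuller treatment would extract several PLS components and add variables one at a time while checking predictive performance, which is closer to what "forward" selection usually means.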
204. Inputting the training samples into a preset probabilistic neural network model for model training to obtain the preset expert group decision model.
Further, the server inputs the obtained training samples into a preset probabilistic neural network model for model training to obtain the preset expert group decision model.
Further, in the implementation of the present invention, step 204 specifically includes:
Repeatedly sampling the training samples using a Bootstrap algorithm to form a plurality of corresponding calculation samples.
Inputting each calculation sample into a preset probabilistic neural network model for model training to obtain a plurality of corresponding target probabilistic neural network models, wherein the preset expert group decision model is the set of the target probabilistic neural network models.
Specifically, because the training sample data is limited, Bootstrap resampling is applied to the selected training samples so that the preset expert group decision model yields a more stable, unbiased estimate. The resampling forms the required number of large calculation samples, and each calculation sample is then used to train a probabilistic neural network, giving a corresponding target probabilistic neural network model, called an "expert". Each calculation sample thus yields one "expert", and together the experts form the "expert group", that is, the preset expert group decision model.
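The Bootstrap step can be sketched as sampling with replacement from the training samples, once per expert. The function name, the default sample size (equal to the training set size) and the fixed seed are assumptions made for illustration.

```python
import random


def bootstrap_samples(training_samples, n_experts, sample_size=None, seed=0):
    """Draw n_experts calculation samples, each sampled with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    if sample_size is None:
        # Assumption: each calculation sample is as large as the training set.
        sample_size = len(training_samples)
    return [rng.choices(training_samples, k=sample_size) for _ in range(n_experts)]


# Hypothetical training set of (feature vector, credit category) pairs.
training = [([0.1, 0.2], 1), ([0.4, 0.1], 0), ([0.3, 0.9], 1), ([0.8, 0.5], 0)]
calculation_samples = bootstrap_samples(training, n_experts=3)
```

Each element of `calculation_samples` would then be used to build one probabilistic neural network, that is, one "expert" in the expert group.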
Further, in the specific implementation of the present invention, the probabilistic neural network model in each of the above experts performs pattern classification, determining the class $C_i$ to which the n-dimensional feature vector $X$ belongs, where $C_i$ takes the value 0 or 1. Assuming that the probability density function $f_i(X)$ of each class is known, then according to the Bayesian classification rule, $X$ is classified into class $C_i$ if the following inequality holds:
$$P_i L_i f_i(X) > P_j L_j f_j(X), \quad i \neq j,$$
where $P_i$ is the prior probability that $X$ belongs to class $C_i$, and $L_i$ is the misclassification cost of class $C_i$.
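A small worked example may help show how the misclassification cost can change the outcome of this rule. The function name `bayes_decide` and all numeric values below are illustrative assumptions, not values from the patent.

```python
def bayes_decide(priors, costs, densities):
    """Return the class index i that maximizes P_i * L_i * f_i(X)."""
    scores = [p * l * f for p, l, f in zip(priors, costs, densities)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical two-class case with class-conditional densities evaluated at X.
priors = [0.7, 0.3]
densities = [0.2, 0.1]

# Equal costs: 0.7 * 1 * 0.2 = 0.14 beats 0.3 * 1 * 0.1 = 0.03, so class 0.
equal_cost_class = bayes_decide(priors, [1.0, 1.0], densities)

# Class 1 made five times as costly to misclassify:
# 0.3 * 5 * 0.1 = 0.15 beats 0.14, so the decision flips to class 1.
weighted_cost_class = bayes_decide(priors, [1.0, 5.0], densities)
```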
Layer 1 of the probabilistic neural network structure is the input layer, which passes input samples to the next layer completely unchanged. Layer 2 is the pattern (mode) layer. Every pattern unit receives the same input, the number of pattern units is normally equal to the number of training samples, and the output of each pattern unit in this layer is:
Figure RE-GDA0002886589610000101
where $W_i$ is the weight of the connection from the input layer to the pattern (mode) layer, and $\delta$ is a smoothing factor, which plays a crucial role in classification.
Layer 3 of the probabilistic neural network structure is the summation (accumulation) layer, which accumulates the probability of belonging to a given class according to the following formula:
Figure RE-GDA0002886589610000102
where $k_i$ is the number of pattern samples belonging to class $C_i$, and $X_{ij}$ is a pattern sample belonging to class $C_i$, which yields the maximum likelihood that the input sample belongs to class $C_i$. The number of summation layer units is typically equal to the number of classes.
Layer 4 of the probabilistic neural network structure is the decision layer, whose function is to implement the rule $P_i L_i f_i(X) > P_j L_j f_j(X)$, $i \neq j$. The probabilistic neural network has the characteristic of outputting Bayesian posterior probabilities. In a conventional probabilistic neural network the weights do not need to be trained: the weights between the input layer and the pattern (mode) layer are set directly to the training samples. The most important advantage of the probabilistic neural network in operation is that training is simple and completes almost instantaneously.
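The four layers can be sketched in a few lines of Python. Since the formula figures are not reproduced in this text, the sketch assumes the commonly used Gaussian kernel $\exp(-\lVert X - W_i \rVert^2 / (2\delta^2))$ for the pattern layer, estimates priors from class frequencies, and uses equal misclassification costs; the class name and method names are hypothetical.

```python
import math


class ProbabilisticNeuralNetwork:
    """Minimal PNN sketch: Gaussian pattern units, class-wise averaging."""

    def __init__(self, delta=1.0):
        self.delta = delta      # smoothing factor
        self.patterns = {}      # class label -> list of stored training samples

    def fit(self, X, y):
        # "Training" only stores the samples as pattern-layer weights.
        self.patterns = {}
        for x, label in zip(X, y):
            self.patterns.setdefault(label, []).append(list(x))
        return self

    def _pattern_output(self, x, w):
        # Assumed Gaussian kernel; normalizing constants common to all
        # classes are omitted because they cancel in the decision.
        d2 = sum((a - b) ** 2 for a, b in zip(x, w))
        return math.exp(-d2 / (2.0 * self.delta ** 2))

    def class_densities(self, x):
        # Summation layer: average pattern output per class.
        return {
            c: sum(self._pattern_output(x, w) for w in ws) / len(ws)
            for c, ws in self.patterns.items()
        }

    def predict_proba(self, x):
        # Posterior proportional to prior * density (equal costs assumed).
        # Note: if x is extremely far from every stored sample, all kernels
        # can underflow to zero; this sketch does not guard against that.
        n = sum(len(ws) for ws in self.patterns.values())
        dens = self.class_densities(x)
        weighted = {c: len(self.patterns[c]) / n * d for c, d in dens.items()}
        total = sum(weighted.values())
        return {c: v / total for c, v in weighted.items()}

    def predict(self, x):
        # Decision layer: pick the class with the largest posterior.
        proba = self.predict_proba(x)
        return max(proba, key=proba.get)


pnn = ProbabilisticNeuralNetwork(delta=1.0).fit(
    [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]], [0, 0, 1, 1]
)
near_first = pnn.predict([0.2, 0.3])
near_second = pnn.predict([5.0, 5.5])
proba = pnn.predict_proba([0.2, 0.3])
```

The `fit` method makes the "no weight training" property concrete: it only copies samples, so the cost of the model is paid at prediction time, where every stored sample is visited.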
The credit scoring method in the embodiment of the present invention has been described above. The credit scoring apparatus in the embodiment of the present invention is described below with reference to Fig. 3. An embodiment of the credit scoring apparatus includes:
A target client data obtaining module 301, configured to obtain target application client data.
A characteristic variable screening module 302, configured to perform characteristic variable screening on the target application client data according to preset characteristic variables.
A standardized data acquisition module 303, configured to process the screened characteristic variables in a data standardization processing mode according to preset variable standardization parameters to obtain corresponding standardized data.
A conservation probability prediction module 304, configured to input the standardized data into a preset expert group decision model for prediction, and output the conservation probability corresponding to the target application client.
A credit score output module 305, configured to input the conservation probability and the historical credit record of the target application client into a preset credit scoring model, and output the credit score of the target application client.
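The work of modules 302 and 303 can be illustrated with a short sketch that applies parameters recorded during training to a new applicant record. The field names, parameter values, and function names below are hypothetical, and z-score scaling is only one assumed form of dimensionless processing.

```python
def screen_features(record, preset_features):
    """Keep only the preset characteristic variables, in their preset order."""
    return {name: record[name] for name in preset_features}


def standardize(features, preset_params):
    """Apply recorded (mean, std) parameters to each screened variable.

    z-score scaling is an assumption of this sketch.
    """
    return [
        (features[name] - mean) / std
        for name, (mean, std) in preset_params.items()
    ]


# Hypothetical applicant record; "phone" is not a preset characteristic variable.
record = {"income": 8000.0, "age": 35.0, "phone": "n/a"}
preset_features = ["age", "income"]
preset_params = {"age": (30.0, 10.0), "income": (5000.0, 2000.0)}

screened = screen_features(record, preset_features)
standardized = standardize(screened, preset_params)
```

Reusing the recorded parameters, rather than recomputing them from the new record, keeps the applicant's data on the same scale as the data the expert group was trained on.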
Optionally, in another implementation manner of the credit scoring apparatus of the present invention, the credit scoring apparatus further includes:
A credit granting grade output module, configured to input the credit score of the target application client into a preset client credit grade standard model and output the credit granting grade of the target application client.
Optionally, in another implementation manner of the credit scoring apparatus of the present invention, the credit scoring apparatus further includes:
An execution interest rate level output module, configured to input the credit granting grade of the target application client into a preset grade standard execution interest rate level model and output the execution interest rate level of the target application client.
Optionally, in another implementation manner of the credit scoring apparatus of the present invention, as shown in fig. 4, the credit scoring apparatus includes:
A raw data sampling module 401, configured to sample and acquire client data of known credit categories from a preset database to form an original data sample.
A preset standardized parameter obtaining module 402, configured to process the variables in the original data sample in a data standardization processing mode to generate corresponding standardized samples, and to output and record the standardized parameters of each variable, where the preset variable standardization parameters are the standardized parameters of each variable.
A preset characteristic variable obtaining module 403, configured to perform characteristic variable extraction on the standardized samples according to a preset extraction model to generate corresponding training samples, and to output and record the selected characteristic variables, where the preset characteristic variables are the selected characteristic variables.
A model training module 404, configured to input the training samples into a preset probabilistic neural network model for model training to obtain the preset expert group decision model.
Optionally, in another implementation manner of the credit scoring apparatus according to the present invention, the data standardization processing mode includes variable virtualization processing (that is, dummy-variable encoding) of attribute data and dimensionless processing of numerical data.
The preset extraction model is a partial least squares regression model.
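The two kinds of processing named here can be sketched as below. The helper names are hypothetical; dropping the first category as the reference level and using the population mean and standard deviation are assumptions of the sketch rather than choices stated in the source.

```python
import statistics


def dummy_encode(value, categories):
    """Encode an attribute value as 0/1 indicators.

    The first category is treated as the reference level and gets no
    indicator (an assumed convention).
    """
    return [1 if value == c else 0 for c in categories[1:]]


def fit_zscore(values):
    """Record the standardization parameters (mean, std) of a numeric variable."""
    return statistics.fmean(values), statistics.pstdev(values)


def apply_zscore(value, params):
    """Make a numeric value dimensionless using recorded parameters."""
    mean, std = params
    return (value - mean) / std


marital = ["single", "married", "divorced"]
encoded_married = dummy_encode("married", marital)
encoded_single = dummy_encode("single", marital)

params = fit_zscore([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
scaled = apply_zscore(9.0, params)
```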
Optionally, in another implementation manner of the credit scoring apparatus of the present invention, the model training module 404 specifically includes:
A resampling unit, configured to repeatedly sample the training samples using a Bootstrap algorithm to form a plurality of corresponding calculation samples.
A target probabilistic neural network model acquisition unit, configured to input each calculation sample into a preset probabilistic neural network model for model training to obtain a plurality of corresponding target probabilistic neural network models, where the preset expert group decision model is the set of the target probabilistic neural network models.
Optionally, in another implementation manner of the credit scoring apparatus of the present invention, the conservation probability prediction module 304 specifically includes:
A target conservation probability acquisition unit, configured to input the standardized data into each target probabilistic neural network model and output a plurality of corresponding target conservation probabilities.
A weighted average processing unit, configured to compute a weighted average of all the target conservation probabilities to obtain the conservation probability corresponding to the target application client.
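The weighted averaging step can be sketched as follows. The source does not state how the weights are chosen, so the function below (a hypothetical name) falls back to equal weights unless weights are supplied, and the probabilities shown are made-up values.

```python
def combine_expert_probabilities(probabilities, weights=None):
    """Weighted average of the target conservation probabilities."""
    if weights is None:
        weights = [1.0] * len(probabilities)  # equal weights assumed by default
    if len(weights) != len(probabilities):
        raise ValueError("one weight is required per expert")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, probabilities)) / total


# Three experts with equal weight, then two experts weighted 3:1.
equal_weighted = combine_expert_probabilities([0.8, 0.6, 0.7])
custom_weighted = combine_expert_probabilities([0.9, 0.5], weights=[3.0, 1.0])
```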
The credit scoring apparatus in the embodiment of the present invention has been described in detail above with reference to Fig. 3 and Fig. 4 from the perspective of modular functional entities. The credit scoring device in the embodiment of the present invention is described in detail below from the perspective of hardware processing.
Fig. 5 is a schematic structural diagram of a credit scoring device 500 according to an embodiment of the present invention. The credit scoring device 500 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 501, a memory 509, and one or more storage media 508 (e.g., one or more mass storage devices) storing an application 507 or data 506. The memory 509 and the storage medium 508 may provide transient storage or persistent storage. The program stored on the storage medium 508 may include one or more modules (not shown), each of which may include a series of instruction operations for the credit scoring device. Further, the processor 501 may be configured to communicate with the storage medium 508 and to execute, on the credit scoring device 500, the series of instruction operations in the storage medium 508.
The credit scoring device 500 may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input-output interfaces 504, and/or one or more operating systems 505, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and so forth. Those skilled in the art will appreciate that the configuration shown in Fig. 5 does not limit the credit scoring device, which may include more or fewer components than shown, combine some components, or arrange the components differently.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A credit scoring method, comprising:
acquiring target application client data;
screening characteristic variables of the target application client data according to preset characteristic variables;
processing the screened characteristic variables according to preset variable standardization parameters in a data standardization processing mode to obtain corresponding standardized data;
inputting the standardized data into a preset expert group decision model for prediction, and outputting the corresponding conservation probability of the target application client;
and inputting the conservation probability and the historical credit record of the target application client into a preset credit scoring model, and outputting the credit score of the target application client.
2. A credit scoring method according to claim 1, further comprising:
and inputting the credit score of the target application client into a preset client credit grade standard model, and outputting the credit granting grade of the target application client.
3. The credit scoring method of claim 2, further comprising:
and inputting the credit granting grade of the target application client into a preset grade standard execution interest rate level model, and outputting the execution interest rate level of the target application client.
4. The credit scoring method of claim 1, wherein prior to obtaining the target application customer data, the method further comprises:
sampling and acquiring client data of known credit categories from a preset database to form an original data sample;
processing variables in the original data samples according to a data standardization processing mode to generate corresponding standardized samples, outputting and recording standardized parameters of each variable, wherein the preset standardized parameters of the variables are standardized parameters of each variable;
extracting characteristic variables of the standardized samples according to a preset extraction model to generate corresponding training samples, outputting and recording the selected characteristic variables, wherein the preset characteristic variables are the selected characteristic variables;
and inputting the training samples into a preset probabilistic neural network model for model training to obtain the preset expert group decision model.
5. The credit scoring method according to claim 4, wherein the data standardization processing mode includes variable virtualization processing of attribute data and dimensionless processing of numerical data;
the preset extraction model is a partial least squares regression model.
6. The method according to claim 4, wherein the inputting the training samples into a preset probabilistic neural network model for model training specifically comprises:
using a Bootstrap algorithm to repeatedly sample the training samples to form a plurality of corresponding calculation samples;
and inputting each calculation sample into a preset probability neural network model for model training to obtain a plurality of corresponding target probability neural network models, wherein the preset expert group decision model is a set of the target probability neural network models.
7. The method according to any one of claims 1 to 6, wherein the inputting the standardized data into a preset expert group decision model for prediction and outputting the corresponding conservation probability of the target application client specifically comprises:
inputting the standardized data into each target probability neural network model, and outputting a plurality of corresponding target conservation probabilities;
and solving a weighted average of all the target conservation probabilities to obtain the corresponding conservation probability of the target application client.
8. A credit scoring apparatus, comprising:
the target client data acquisition module is used for acquiring target application client data;
the characteristic variable screening module is used for screening the characteristic variables of the target application client data according to preset characteristic variables;
the standardized data acquisition module is used for processing the screened characteristic variables according to preset variable standardized parameters in a data standardized processing mode to obtain corresponding standardized data;
the conservation probability prediction module is used for inputting the standardized data into a preset expert group decision model for prediction and outputting the conservation probability corresponding to the target application client;
and the credit score output module is used for inputting the conservation probability and the historical credit record of the target application client into a preset credit score model and outputting the credit score of the target application client.
9. A credit scoring device, characterized in that the credit scoring device comprises: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the credit scoring device to perform the method of any one of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202011081209.7A 2020-10-09 2020-10-09 Credit scoring method, device, equipment and storage medium Pending CN112348685A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011081209.7A CN112348685A (en) 2020-10-09 2020-10-09 Credit scoring method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011081209.7A CN112348685A (en) 2020-10-09 2020-10-09 Credit scoring method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112348685A true CN112348685A (en) 2021-02-09

Family

ID=74361750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011081209.7A Pending CN112348685A (en) 2020-10-09 2020-10-09 Credit scoring method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112348685A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113362159A (en) * 2021-06-07 2021-09-07 中国工商银行股份有限公司 Method, device and equipment for determining user credit
CN117909378A (en) * 2023-12-20 2024-04-19 常州德汇智能化工程有限公司 Mining computing system based on big data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106408413A (en) * 2016-09-23 2017-02-15 快睿登信息科技(上海)有限公司 Multi-cycle installment decision making method and system
CN108564286A (en) * 2018-04-19 2018-09-21 天合泽泰(厦门)征信服务有限公司 A kind of artificial intelligence finance air control credit assessment method and system based on big data reference
CN108737138A (en) * 2017-04-18 2018-11-02 腾讯科技(深圳)有限公司 Service providing method and service platform
CN110264330A (en) * 2018-03-13 2019-09-20 腾讯科技(深圳)有限公司 Credit index calculating method, device, computer readable storage medium
US20200097850A1 (en) * 2018-09-20 2020-03-26 Electronics And Telecommunications Research Institute Machine learning apparatus and method based on multi-feature extraction and transfer learning, and leak detection apparatus using the same
CN111507831A (en) * 2020-05-29 2020-08-07 长安汽车金融有限公司 Credit risk automatic assessment method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何晖: "《现代信号检测技术与评估理论的应用与研究》", 31 August 2018 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210209
