CN112529684A - Customer credit assessment method and system based on FWA_DBN - Google Patents


Info

Publication number
CN112529684A
Authority
CN
China
Prior art keywords
dbn
data
layer
training
network
Prior art date
Legal status
Pending
Application number
CN202011351682.2A
Other languages
Chinese (zh)
Inventor
江远强
Current Assignee
Baiweijinke Shanghai Information Technology Co ltd
Original Assignee
Baiweijinke Shanghai Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Baiweijinke Shanghai Information Technology Co ltd filed Critical Baiweijinke Shanghai Information Technology Co ltd
Priority to CN202011351682.2A
Publication of CN112529684A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03: Credit; Loans; Processing thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks


Abstract

The invention relates to the technical field of risk control in the internet finance industry, and in particular to a customer credit evaluation method and system based on FWA_DBN. Compared with shallow neural networks, the DBN is a deep and efficient learning algorithm: it extracts deep-level features of the data, realizes feature extraction and classification for high-dimensional nonlinear data, and improves the generalization ability and prediction accuracy of the network. The firework algorithm drives population optimization step by step, balancing global exploration against local search, and shows excellent performance and high efficiency on complex optimization problems. The DBN model optimized by the FWA algorithm can process massive data, converges quickly and globally, and predicts stably, making it suitable for real-time credit evaluation of customers on an internet finance platform.

Description

Customer credit assessment method and system based on FWA_DBN
Technical Field
The invention relates to the technical field of risk control in the internet finance industry, and in particular to a customer credit assessment method and system based on FWA_DBN.
Background
With the rapid development of internet finance, credit assessment of customers is no longer limited to credit-bureau reports; increasingly it is combined with big-data risk-control models. Traditional assessment mainly relies on machine-learning methods such as logistic regression, support vector machines and random forests. Their theory is mature, their validation methods are complete, and their computation is simple, but the applicable objects are narrow and the prediction accuracy is not ideal. Artificial neural networks have proven to perform well in recent research. Current credit-assessment applications mostly adopt BP neural networks, RBF neural networks and Elman recurrent neural networks, but these commonly suffer from slow convergence and a tendency to fall into local minima. Research focus has therefore shifted to deep learning, which, compared with traditional artificial-intelligence methods, has stronger feature-extraction capability and can mine deep, complex associations in the data, thereby improving accuracy.
A Deep Belief Network (DBN) is a deep learning network formed by stacking multiple Restricted Boltzmann Machines (RBMs). The DBN model is optimized by training the RBMs layer by layer and then fine-tuning backwards with the training method of a BP neural network, realizing feature extraction and classification for high-dimensional nonlinear data; this makes the DBN well suited to fraud detection on internet-finance platforms.
Although each RBM layer of the DBN determines its parameters layer by layer during training, the initial parameters are in general chosen randomly, so training easily falls into local optima, which harms the convergence rate and prediction accuracy of the model. How to use a more suitable intelligent algorithm to optimize the initial parameters of the DBN and apply it to customer credit evaluation is the technical problem to be solved by those skilled in the art; a customer credit evaluation method and system based on FWA_DBN is therefore provided to solve the above problems.
Disclosure of Invention
The present invention is directed to a customer credit assessment method and system based on FWA_DBN, so as to solve the problems mentioned in the background art.
In order to achieve the purpose, the invention provides the following technical scheme:
A customer credit assessment method and system based on FWA_DBN comprises the steps:
S1, sampling customers with existing loan repayment performance as modeling samples, and collecting their credit feature data;
S2, preprocessing the obtained modeling data, normalizing the preprocessed data with the min-max method, and splitting a training set and a test set at a preset ratio;
S3, preliminarily determining the DBN structure from the training-data characteristics and initializing the relevant DBN parameters, including: the number of input nodes, the number of output nodes, the maximum number of layers, the number of nodes per layer and the maximum number of iterations;
S4, training the DBN with the training set and optimizing the network model parameters with the FWA algorithm to obtain the FWA-DBN prediction model;
S5, feeding the test set into the FWA-DBN for testing; if the test precision does not meet the preset threshold, repeating steps S3 and S4 to retrain the FWA-DBN prediction model;
S6, deploying the FWA-DBN customer credit evaluation model to the loan application platform to output a real-time application credit score, realizing real-time approval of applicants; repayment-performance data are fed back into model training at regular intervals to update the model online.
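The S1-S6 pipeline can be sketched end to end. This is a minimal illustration with hypothetical stubs (the patent fixes no API; the "model" below is a least-squares placeholder standing in for the FWA-optimized DBN):

```python
import numpy as np

def preprocess(X):                       # S2: min-max normalise each column to [-1, 1]
    lo, hi = X.min(axis=0), X.max(axis=0)
    return -1.0 + 2.0 * (X - lo) / (hi - lo)

def split(X, y, ratio=0.7):              # S2: 7:3 split in application-time order
    k = int(len(X) * ratio)
    return (X[:k], y[:k]), (X[k:], y[k:])

def train_fwa_dbn(Xtr, ytr):             # S3-S4: placeholder for the FWA-optimised DBN
    w = np.linalg.lstsq(Xtr, ytr, rcond=None)[0]
    return w

def accuracy(w, Xte, yte):               # S5: test-set accuracy check
    return float(np.mean((Xte @ w > 0.5) == (yte > 0.5)))

rng = np.random.default_rng(0)
X = rng.random((100, 4)); y = (X[:, 0] > 0.5).astype(float)
Xn = preprocess(X)
(Xtr, ytr), (Xte, yte) = split(Xn, y)
acc = accuracy(train_fwa_dbn(Xtr, ytr), Xte, yte)
```

In a real deployment the placeholder model would be the trained FWA-DBN and step S6 would serve `acc`-gated scores online.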
Preferably, in S1, customers with existing loan repayment performance are sampled as modeling samples and their credit feature data are collected, comprising personal basic information, in-app behavioral (event-tracking) data and third-party data.
Preferably, in S2, missing data are completed by interpolation with the median so that the data samples are consistent. Because differing dimensions and large numeric gaps between raw variables would affect DBN training, the raw data must be normalized; the normalization function used in this patent is:

y = y_min + (y_max - y_min) * (x - x_min) / (x_max - x_min)

where y_max defaults to 1, y_min defaults to -1, x is the raw data item currently being processed, x_max is the maximum of all raw data items to be processed, and x_min is their minimum.

The normalized data set is then split, in order of application time, at a ratio of 7:3 into a training set and a test set.
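The preprocessing just described (median completion plus min-max scaling to [y_min, y_max] = [-1, 1]) can be sketched as follows; function names are mine, not from the patent:

```python
import numpy as np

def impute_median(col):
    """Replace NaNs in a 1-D array with the column median."""
    med = np.nanmedian(col)
    out = col.copy()
    out[np.isnan(out)] = med
    return out

def min_max(x, y_min=-1.0, y_max=1.0):
    """y = y_min + (y_max - y_min) * (x - x_min) / (x_max - x_min)."""
    x_min, x_max = x.min(), x.max()
    return y_min + (y_max - y_min) * (x - x_min) / (x_max - x_min)

col = np.array([3.0, np.nan, 7.0, 5.0])
scaled = min_max(impute_median(col))     # NaN -> 5.0 (median), then scale to [-1, 1]
```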
Preferably, in S3, the DBN model is constructed according to the training set, and the neural network parameters are initialized.
The deep belief network is a probability generation model and is formed by sequentially overlapping a plurality of Restricted Boltzmann Machines (RBMs), wherein the bottommost layer of the deep belief network receives input data vectors and converts the input data to a hidden layer through the RBMs, namely the input of a higher layer of RBMs is from the output of a lower layer of RBMs.
Step 3-1: establishing an energy function
The RBM is a two-layer bidirectional network consisting of a visible layer and a hidden layer: the visible layer, a layer of visible units v representing the input, receives the input, and the hidden layer, a layer of hidden units h representing latent variables, extracts features. Let the visible-layer units be v = (v1, v2, v3, ..., vI), vi ∈ {0,1}, the hidden-layer units be h = (h1, h2, h3, ..., hJ), hj ∈ {0,1}, the weight matrix be w, the visible-unit thresholds be b and the hidden-unit thresholds be c. The energy function of the joint state (v, h) of all visible and hidden units is:

E(v, h) = - Σ(i=1..I) b_i v_i - Σ(j=1..J) c_j h_j - Σ(i=1..I) Σ(j=1..J) v_i w_ij h_j

where w_ij is the connection weight between the ith visible unit and the jth hidden unit, v_i and h_j are the states of visible unit i and hidden unit j, b_i and c_j are the thresholds of visible unit i and hidden unit j, I is the number of visible units and J is the number of hidden units.
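The RBM energy can be computed directly. A toy sketch follows; the bias naming varies in the text, so here b is taken as the visible and c as the hidden bias, matching the parameter set θ = {W, b, c} used later:

```python
import numpy as np

def rbm_energy(v, h, W, b, c):
    """E(v,h) = -b.v - c.h - v.W.h for binary state vectors v and h."""
    return float(-b @ v - c @ h - v @ W @ h)

W = np.array([[1.0, -1.0],
              [0.5,  0.0]])              # I=2 visible x J=2 hidden weights
b = np.array([0.1, 0.2])                 # visible biases
c = np.array([-0.3, 0.4])                # hidden biases
E = rbm_energy(np.array([1.0, 0.0]), np.array([0.0, 1.0]), W, b, c)
```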
Step 3-2: joint probability distribution
From the energy function E(v, h) above, the joint probability distribution over the hidden and visible layers is:

P(v, h) = (1/Z) e^(-E(v, h))

Z = Σ(v,h) e^(-E(v, h))

where Z is the normalization constant (partition function) of the simulated physical system, obtained by summing e^(-E(v,h)) over all joint states of the visible and hidden units.
Step 3-3: independent distribution
From the joint probability distribution, the marginal distribution of the visible vector v is:

P(v) = (1/Z) Σ(h) e^(-E(v, h))

Given a visible vector v, the conditional probability of the hidden vector h factorizes as:

P(h | v) = Π(j=1..J) P(h_j | v)

and, given a hidden vector h, the conditional probability of the visible vector v is:

P(v | h) = Π(i=1..I) P(v_i | h)

Since the RBM units are binary, the logistic sigmoid activation function is used, written

σ(x) = 1 / (1 + e^(-x))
Step 3-4: probability of activation
By the structure and state probabilities of the RBM, given the states of the visible units, the hidden units are conditionally independent of one another; likewise, given the states of the hidden units, the visible units are conditionally independent. The activation probabilities of hidden unit j and visible unit i are:

P(h_j = 1 | v) = σ(c_j + Σ(i=1..I) v_i w_ij)

P(v_i = 1 | h) = σ(b_i + Σ(j=1..J) w_ij h_j)
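These two conditional probabilities are what one Gibbs sampling step of an RBM uses. A minimal sketch (toy dimensions, names mine):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_h_given_v(v, W, c):
    """P(h_j = 1 | v) = sigmoid(c_j + sum_i v_i w_ij)."""
    return sigmoid(c + v @ W)

def p_v_given_h(h, W, b):
    """P(v_i = 1 | h) = sigmoid(b_i + sum_j w_ij h_j)."""
    return sigmoid(b + W @ h)

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (4, 3)); b = np.zeros(4); c = np.zeros(3)
v0 = np.array([1.0, 0.0, 1.0, 0.0])
ph = p_h_given_v(v0, W, c)               # hidden activation probabilities
h0 = (rng.random(3) < ph).astype(float)  # sample binary hidden states
pv = p_v_given_h(h0, W, b)               # visible reconstruction probabilities
```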
step 3-5: adjusting and optimizing weight and threshold value by gradient descent method
The error between the actual and expected outputs is minimized by gradient descent:

E(t) = (1/2) Σ (z(t) - y(t))^2

where E(t) is the error at iteration t, and z(t) and y(t) are the expected and actual output values at iteration t.

The gradient of the weights with respect to this error is then computed and the parameters are tuned along the direction of gradient descent:

w_ij(t+1) = w_ij(t) - μ ∂E(t)/∂w_ij

b_j(t+1) = b_j(t) - μ ∂E(t)/∂b_j

where μ is the learning rate, E(t) is the error at iteration t, and w_ij(t+1) and b_j(t+1) are the weight and threshold after tuning.
Once all RBMs are trained, the multiple RBM network structures are stacked to form the deep belief network.
In S3, a DBN classification network model is established, and network pre-training is performed using unlabeled pre-training data samples, and parameter fine-tuning is performed using labeled samples, which includes the following steps:
(1) inputting normalized pre-training sample data, setting network structure parameters, randomly initializing weights between network layers, and initially setting a threshold value of each layer to zero;
(2) training RBM layers layer by using pre-training sample data, wherein the output of each RBM layer is used as the input of the next layer until the training is finished, and obtaining the network weight and the threshold of each layer;
(3) using the network parameters obtained in step (2) as initial values and the labeled data samples, the DBN is unrolled into a BP network structure: a classification layer is added on top of the network as the final classification layer for the network's feature output, the obtained result is compared with the labels of the input data, and the resulting error is back-propagated to fine-tune the parameters of the whole network.
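Steps (1)-(3) amount to greedy layer-by-layer RBM pre-training followed by supervised fine-tuning. A heavily simplified sketch of the pre-training part (CD-1 is reduced to a single reconstruction pass; all names are mine):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(X, n_hidden, epochs=5, lr=0.1, rng=None):
    """Toy CD-1: returns (W, c) and the hidden representation of X."""
    rng = rng or np.random.default_rng(0)
    W = rng.normal(0, 0.01, (X.shape[1], n_hidden))
    b = np.zeros(X.shape[1]); c = np.zeros(n_hidden)
    for _ in range(epochs):
        ph = sigmoid(c + X @ W)                        # positive phase
        h = (rng.random(ph.shape) < ph).astype(float)  # sampled hidden states
        v1 = sigmoid(b + h @ W.T)                      # reconstruction
        ph1 = sigmoid(c + v1 @ W)                      # negative phase
        W += lr * (X.T @ ph - v1.T @ ph1) / len(X)     # CD-1 weight update
    return (W, c), sigmoid(c + X @ W)

rng = np.random.default_rng(1)
X = (rng.random((50, 8)) > 0.5).astype(float)          # binary "training" data
layers, H = [], X
for n_hidden in (6, 4):                                # stack two RBMs: 8 -> 6 -> 4
    (W, c), H = train_rbm(H, n_hidden, rng=rng)
    layers.append((W, c))
```

The stacked `layers` would then seed the BP fine-tuning of step (3).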
Preferably, in S4, the network thresholds of the visible and hidden layers of the RBM model are b and c respectively, and the connection weights between the two layers are W. To reach an optimal solution for the parameters θ = {W, b, c}, DBN parameter training can be converted into maximizing the log-likelihood of the RBM on the training set. This patent uses the FWA algorithm to optimize the network model parameters.
The Firework Algorithm (FWA) is an efficient swarm-intelligence optimization algorithm that models swarm optimization on the firework explosion process. Each firework individual represents a feasible solution; offspring sparks are generated by a specific explosion strategy, and the best individuals are selected in a certain manner as candidates for the next generation. The population is thus driven toward the optimum step by step, balancing global exploration against local search, and the algorithm shows excellent performance and high efficiency on complex optimization problems.
The firework algorithm has four main components: the explosion operator, the mutation operator, the mapping rule and the selection strategy. The explosion operator covers the explosion radius, the number of explosion sparks and the explosion intensity; the mutation operator is usually Gaussian mutation; the mapping rules are mainly modular-arithmetic mapping and random mapping; the selection strategies are mainly distance-based selection and random selection.
Step 4-1: initial population
Set the initial population size n, the spark limit parameter m, the initial iteration count t = 0 and the maximum iteration count t_max. The objects optimized by the firework algorithm are the connection weights W, visible-layer thresholds b and hidden-layer thresholds c randomly generated by the RBMs in the DBN; for a given RBM structure the weights and thresholds are laid out directly, so a firework individual in the population can be represented as [W_1, W_2, ..., W_n1, b_1, b_2, ..., b_n2, c_1, c_2, ..., c_n2].
Step 4-2: fitness function
The goal of training the algorithm model is to bring the network's output as close as possible to the expected output through continued iterative computation, thereby obtaining the weights and thresholds between neuron nodes at which the network output is best. Because the fitness function of the firework algorithm is tied to the total error function of the neural network, the total error is brought into the firework algorithm to compute each firework's fitness value, so the fitness function fit is:

fit = E

E = (1/2) Σ(k=1..K) Σ(j=1..q) (d_kj - y_kj)^2

where E is the total error of the neural network, K is the number of data samples, q is the number of output-layer neurons, d is the expected network output and y is the actual network output. The fitness function directly reflects the gap between actual and expected output: the smaller the error, the better the fitness.
Step 4-3: explosion operator
The explosion operator covers the explosion radius, the number of explosion sparks and the explosion intensity. The fitness value f(x_i) of each firework is computed, a random number U(-1, 1) is drawn, and the number of sparks S_i and the explosion radius R_i of each firework are calculated as:

S_i = M * (y_max - f(x_i) + ε) / (Σ(i=1..N) (y_max - f(x_i)) + ε)

R_i = R̂ * (f(x_i) - y_min + ε) / (Σ(i=1..N) (f(x_i) - y_min) + ε)

where i = 1, 2, ..., N and N is the total number of fireworks; y_max and y_min are the largest and smallest fitness values in the current population (the fitness being an error, y_min marks the best firework and y_max the worst); f(x_i) is the fitness value of firework x_i; M and R̂ are constants limiting the total number of sparks and the maximum explosion radius respectively; ε is a tiny constant that avoids division by zero.
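The two explosion-operator formulas can be computed for a small population as below (a sketch with illustrative constants; fitness = error, so smaller f(x_i) means a better firework that earns more sparks and a smaller radius):

```python
import numpy as np

def explosion(f, M=10.0, R_hat=2.0, eps=1e-12):
    """Return (spark counts S_i, explosion radii R_i) for fitness values f."""
    y_max, y_min = f.max(), f.min()
    S = M * (y_max - f + eps) / (np.sum(y_max - f) + eps)      # sparks
    R = R_hat * (f - y_min + eps) / (np.sum(f - y_min) + eps)  # radii
    return S, R

f = np.array([0.1, 0.4, 0.5])            # firework errors (0.1 is the best)
S, R = explosion(f)
```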
Step 4-4: mutation operator
The mutation operator increases the diversity of the exploding firework population and avoids falling into local extrema; it is realized through Gaussian mutation. Suppose firework x_i is selected for Gaussian mutation; the mutation operation in dimension k is:

x̂_i^k = x_i^k + (X_B^k - x_i^k) * Gaussian(1, 1)

where x̂_i^k is the Gaussian mutation spark generated after mutating the firework, x_i^k is the position of the ith firework in dimension k, X_B^k is the position in dimension k of the optimal individual X_B in the current population, and Gaussian(1, 1) is a Gaussian-distributed random number with mean 1 and variance 1.
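The Gaussian mutation above can be sketched as follows. Note this follows the best-individual-guided form reconstructed from the symbol list (x_i, X_B, Gaussian(1,1)); the classic FWA variant instead multiplies x_i^k by the Gaussian factor directly:

```python
import numpy as np

def gaussian_mutation(x_i, x_best, rng):
    """Pull firework x_i toward the population best X_B by a Gaussian(1,1) factor."""
    g = rng.normal(1.0, 1.0)             # Gaussian(1,1): mean 1, variance 1
    return x_i + (x_best - x_i) * g      # same factor applied in every dimension k

rng = np.random.default_rng(0)
x_i = np.array([0.0, 2.0]); x_best = np.array([1.0, 1.0])
spark = gaussian_mutation(x_i, x_best, rng)
```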
And 4-5: mapping rules
After Gaussian mutation, some sparks may land beyond the boundary of the feasible region, where they are useless. Out-of-range sparks are handled mainly with the modular-arithmetic mapping rule, computed as:

x̂_i^k = x_LB^k + |x̂_i^k| mod (x_UB^k - x_LB^k)

where x̂_i^k is the position in dimension k of the ith individual that exceeded the boundary, and x_LB^k and x_UB^k are the lower and upper boundaries of the fireworks in dimension k. (The alternative random mapping instead uses rand(0, 1), a uniformly distributed random number on [0, 1]: x̂_i^k = x_LB^k + rand(0, 1) * (x_UB^k - x_LB^k).)
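The modular mapping rule can be applied per dimension as sketched below (names mine), bringing any out-of-bound coordinate back inside [lb, ub]:

```python
import numpy as np

def map_back(x, lb, ub):
    """x_k -> lb_k + |x_k| mod (ub_k - lb_k) wherever x leaves [lb, ub]."""
    out = x.copy()
    bad = (x < lb) | (x > ub)
    out[bad] = lb[bad] + np.abs(x[bad]) % (ub[bad] - lb[bad])
    return out

lb = np.array([-1.0, -1.0]); ub = np.array([1.0, 1.0])
mapped = map_back(np.array([3.5, 0.2]), lb, ub)   # only dimension 0 is out of range
```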
And 4-6: selection policy
The selection operation picks some of the good fireworks and sparks in the feasible region as the fireworks for the next explosion, keeping the best individual after each explosion so as to home in on the optimal solution of the problem. Assume the candidate set is K and the firework population size is N. The individual with the smallest error in the candidate set is carried deterministically into the next generation as a firework; the remaining N - 1 fireworks are chosen from the candidate set by roulette, with crowding in the candidate set measured by Euclidean distance. The sum of distances R(x_i) between the current individual and all individuals in the candidate set is:

R(x_i) = Σ(j∈K) d(x_i, x_j) = Σ(j∈K) ||x_i - x_j||

where R(x_i) is the sum of distances between the current individual and all other individuals in the candidate set, d(x_i, x_j) = ||x_i - x_j|| is the Euclidean distance between the ith and jth individuals, and K is the candidate set.

Having computed R(x_i), the selection probability of each firework is determined by roulette:

p(x_i) = R(x_i) / Σ(j∈K) R(x_j)
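The distance-based roulette selection can be sketched as follows: the best candidate survives deterministically, and the rest are drawn with probability proportional to their summed Euclidean distance R(x_i) (so isolated individuals are favoured, preserving diversity):

```python
import numpy as np

def select(cands, fitness, n, rng):
    """Pick n fireworks from the candidate set (fitness = error, minimised)."""
    best = int(np.argmin(fitness))                   # elitist pick
    diff = cands[:, None, :] - cands[None, :, :]
    R = np.linalg.norm(diff, axis=2).sum(axis=1)     # R(x_i) = sum_j ||x_i - x_j||
    p = R / R.sum()                                  # roulette probabilities
    rest = rng.choice(len(cands), size=n - 1, p=p)   # roulette draw for the rest
    return np.concatenate(([best], rest))

rng = np.random.default_rng(0)
cands = rng.random((6, 3)); fitness = rng.random(6)
chosen = select(cands, fitness, n=4, rng=rng)
```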
and 4-7: determining a stop condition
If the stop condition is met, the loop exits and the optimal result is output; otherwise, return to step 4-3 and continue iterating. The stop condition is reaching the maximum iteration count t_max; the population-best individual X_B is then decoded to obtain the optimal DBN parameter space [W_B, b_B, c_B].
Preferably, in step S5, to compare against DBN models optimized by the genetic algorithm, particle swarm optimization and the ant colony algorithm, the invention uses the Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) as evaluation indexes:

MAPE = (100% / n) Σ(i=1..n) |(ŷ_i - y_i) / ŷ_i|

RMSE = sqrt((1/n) Σ(i=1..n) (y_i - ŷ_i)^2)

where n is the number of output training samples, i is the index of an output training sample, y_i is the predicted value of the ith output training sample, and ŷ_i is its actual value.

Smaller MAPE and RMSE values mean higher prediction accuracy, but both indexes are relative and are only meaningful when compared on the same data.
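The two evaluation metrics can be written out directly (following the text's symbol convention: y = predictions, ŷ = actual values; the sample numbers below are illustrative):

```python
import numpy as np

def mape(y_hat, y):
    """Mean Absolute Percentage Error, in percent (y_hat = actual, y = predicted)."""
    return 100.0 * float(np.mean(np.abs((y_hat - y) / y_hat)))

def rmse(y_hat, y):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

y_hat = np.array([100.0, 200.0, 400.0])  # actual values
y = np.array([110.0, 180.0, 400.0])      # predicted values
m, r = mape(y_hat, y), rmse(y_hat, y)
```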
Preferably, in step S6, the FWA_DBN credit scoring model is deployed to the loan application platform to output a real-time application credit score, realizing real-time approval of applicants; repayment-performance data are fed into model training at regular intervals to update the model online.
Preferably, a customer credit evaluation system based on FWA_DBN is also provided, comprising the following modules:
a data-set acquisition and labeling module: used at the loan-system back end to obtain a training data set comprising application, repayment, operation-behavior and third-party data;
a feature-data extraction and normalization module: used to extract, from the acquired training data, the feature data strongly associated with repayment performance, normalize it, and split it into a training set and a test set;
a DBN model building module: used to build the DBN model from the given training samples and the number of hidden-layer nodes, and to automatically optimize the RBM network parameters θ = {W, b, c} in the DBN model with the FWA algorithm;
a DBN training and testing module: used to train the optimized DBN with the training set and verify it with the test set, obtaining the model's prediction accuracy;
a DBN prediction module: the trained DBN network performs credit-risk-grade evaluation and prediction for online applicants.
Compared with the prior art, the invention has the beneficial effects that:
1. Compared with other shallow neural networks, the DBN is a deep and efficient learning algorithm: it extracts deep features of the data, achieves high-dimensional nonlinear feature extraction and classification, and improves the generalization capability and prediction accuracy of the network.
2. Compared with optimization algorithms such as the genetic algorithm, particle swarm optimization and the ant colony algorithm, the firework algorithm drives population optimization step by step, balances global exploration against local search, and shows excellent performance and high efficiency on complex optimization problems.
3. According to the invention, the DBN model optimized by the FWA algorithm can process massive data, converges quickly and globally, and predicts stably, making it suitable for real-time credit evaluation of customers on an internet financial platform.
Drawings
Fig. 1 is a schematic view of the overall structure of the present invention.
Detailed Description
Example 1:
referring to fig. 1, the present invention provides a technical solution:
A customer credit assessment method and system based on FWA_DBN comprises the steps:
S1, sampling customers with existing loan repayment performance as modeling samples, and collecting their credit feature data;
S2, preprocessing the obtained modeling data, normalizing the preprocessed data with the min-max method, and splitting a training set and a test set at a preset ratio;
S3, preliminarily determining the DBN structure from the training-data characteristics and initializing the relevant DBN parameters, including: the number of input nodes, the number of output nodes, the maximum number of layers, the number of nodes per layer and the maximum number of iterations;
S4, training the DBN with the training set and optimizing the network model parameters with the FWA algorithm to obtain the FWA-DBN prediction model;
S5, feeding the test set into the FWA-DBN for testing; if the test precision does not meet the preset threshold, repeating steps S3 and S4 to retrain the FWA-DBN prediction model;
S6, deploying the FWA-DBN customer credit evaluation model to the loan application platform to output a real-time application credit score, realizing real-time approval of applicants; repayment-performance data are fed back into model training at regular intervals to update the model online.
In S1, customers with existing loan repayment performance are sampled as modeling samples and their credit feature data are collected, comprising personal basic information, in-app behavioral (event-tracking) data and third-party data; this arrangement facilitates collecting user data for subsequent analysis.
in S2, for missing data, a median complement is inserted by using an interpolation method to keep the data samples consistent; the original variable needs to be normalized for original data due to the influence on the training of the DBN caused by different dimensions and too large numerical difference, and a normalization function formula used in the patent is as follows:
Figure BDA0002801484670000111
wherein, ymaxDefault to 1, yminDefault to-1, x is the original data item currently being processed, xmaxFor the maximum of all the original data items to be processed, xminIs the minimum of all the raw data items to be processed.
And (3) carrying out normalization on the data set according to the application time in proportion of 7: 3, the data are divided into a training set and a testing set, the arrangement is favorable for improving the stability of the data,
and S3, constructing a DBN model according to the training set, and initializing neural network parameters.
The deep belief network is a probability generation model and is formed by sequentially overlapping a plurality of Restricted Boltzmann Machines (RBMs), wherein the bottommost layer of the deep belief network receives input data vectors and converts the input data to a hidden layer through the RBMs, namely the input of a higher layer of RBMs is from the output of a lower layer of RBMs.
Step 3-1: establishing an energy function
The RBM is a two-layer bidirectional network consisting of a visible layer and a hidden layer: the visible layer, a layer of visible units v representing the input, receives the input, and the hidden layer, a layer of hidden units h representing latent variables, extracts features. Let the visible-layer units be v = (v1, v2, v3, ..., vI), vi ∈ {0,1}, the hidden-layer units be h = (h1, h2, h3, ..., hJ), hj ∈ {0,1}, the weight matrix be w, the visible-unit thresholds be b and the hidden-unit thresholds be c. The energy function of the joint state (v, h) of all visible and hidden units is:

E(v, h) = - Σ(i=1..I) b_i v_i - Σ(j=1..J) c_j h_j - Σ(i=1..I) Σ(j=1..J) v_i w_ij h_j

where w_ij is the connection weight between the ith visible unit and the jth hidden unit, v_i and h_j are the states of visible unit i and hidden unit j, b_i and c_j are the thresholds of visible unit i and hidden unit j, I is the number of visible units and J is the number of hidden units.
Step 3-2: joint probability distribution
From the energy function E(v, h) above, the joint probability distribution over the hidden and visible layers is:

P(v, h) = (1/Z) e^(-E(v, h))

Z = Σ(v,h) e^(-E(v, h))

where Z is the normalization constant (partition function) of the simulated physical system, obtained by summing e^(-E(v,h)) over all joint states of the visible and hidden units.
Step 3-3: independent distribution
From the joint probability distribution, the marginal distribution of the visible vector v is:

P(v) = (1/Z) Σ(h) e^(-E(v, h))

Given a visible vector v, the conditional probability of the hidden vector h factorizes as:

P(h | v) = Π(j=1..J) P(h_j | v)

and, given a hidden vector h, the conditional probability of the visible vector v is:

P(v | h) = Π(i=1..I) P(v_i | h)

Since the RBM units are binary, the logistic sigmoid activation function is used, written

σ(x) = 1 / (1 + e^(-x))
Step 3-4: probability of activation
According to the structure and state probabilities of the RBM, the states of the hidden units are mutually independent when the states of the visible units are given; likewise, the states of the visible units are mutually independent when the states of the hidden units are given. The activation probabilities of hidden unit j and visible unit i are therefore:

P(h_j = 1 \mid v) = \sigma\left(c_j + \sum_{i=1}^{I} v_i w_{ij}\right)

P(v_i = 1 \mid h) = \sigma\left(b_i + \sum_{j=1}^{J} w_{ij} h_j\right)
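The two activation probabilities can be sketched directly as code (a minimal plain-Python version with illustrative names, not the patent's implementation):

```python
# Sketch: RBM activation probabilities, following
# P(h_j=1|v) = sigmoid(c_j + sum_i v_i*w_ij) and
# P(v_i=1|h) = sigmoid(b_i + sum_j w_ij*h_j). Names are illustrative.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, w, c):
    return [sigmoid(c[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_probs(h, w, b):
    return [sigmoid(b[i] + sum(w[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

v = [1, 0, 1]
w = [[0.5, -0.2], [0.3, 0.8], [-0.1, 0.4]]
b = [0.1, 0.2, -0.3]
c = [0.0, 0.0]
print(hidden_probs(v, w, c))   # each entry lies in (0, 1)
```

Sampling each unit as a Bernoulli draw with these probabilities is exactly the Gibbs step used when training the RBM.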
step 3-5: adjusting and optimizing weight and threshold value by gradient descent method
The error value between the actual output and the expected output is obtained for gradient descent:

E(t) = \frac{1}{2} (z(t) - y(t))^2

where E(t) is the error at iteration t, and z(t) and y(t) are the expected output value and the actual output value, respectively. The gradient of each parameter with respect to the error is then computed, and the parameters are tuned along the direction of descent of the gradient:

W_{ij}(t+1) = W_{ij}(t) - \mu \frac{\partial E(t)}{\partial W_{ij}(t)}

b_j(t+1) = b_j(t) - \mu \frac{\partial E(t)}{\partial b_j(t)}

where \mu is the learning rate, and W_{ij}(t+1) and b_j(t+1) are the tuned weight and threshold, respectively.
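A minimal sketch of one such gradient-descent tuning step, assuming for illustration a single linear unit y = W·x with the squared error above (so dE/dW = (y − z)·x; all names are illustrative):

```python
# Sketch: repeated gradient-descent updates W(t+1) = W(t) - mu * dE/dW
# for a single linear unit y = W*x and error E = 0.5*(z - y)^2.
# Illustrative only; the DBN applies the same rule per weight and threshold.

def update_weight(w, x, z, mu):
    y = w * x                 # actual output
    grad = (y - z) * x        # dE/dW for the squared error
    return w - mu * grad

w = 0.5
for _ in range(100):
    w = update_weight(w, x=1.0, z=2.0, mu=0.1)
print(w)   # converges toward the weight that makes y match z, here 2.0
```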
After all the RBMs have been trained, they are stacked to form the deep belief network.
In S3, a DBN classification network model is established; network pre-training is performed with unlabeled pre-training data samples, and parameter fine-tuning with labeled samples, as follows:
(1) input the normalized pre-training sample data, set the network structure parameters, randomly initialize the weights between the network layers, and initialize the threshold of each layer to zero;
(2) train the RBM layers one by one with the pre-training sample data, the output of each RBM layer serving as the input of the next, until training finishes, obtaining the network weights and thresholds of each layer;
(3) using the network parameters obtained in step (2) as initial values, expand the DBN into a BP network structure with the labeled data samples, adding a classification layer at the top of the network as the final classification judgment layer for the output network features; the obtained result is compared with the labels of the input labeled data, the resulting error is back-propagated, and the parameters of the whole network are fine-tuned. This arrangement facilitates adjusting the parameters through the errors.
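The greedy layer-by-layer scheme of steps (1)–(2) can be sketched as below. The `train_rbm` stand-in is illustrative only — it maps data through random weights with a sigmoid rather than running the patent's contrastive training — what matters is the stacking: each layer consumes the previous layer's output.

```python
# Sketch of greedy layer-wise pre-training: each RBM layer is trained on
# the output of the layer below it. train_rbm is a stand-in (illustrative,
# not the patent's routine): it just maps data through random weights.
import math, random

def train_rbm(data, n_hidden, rng):
    n_vis = len(data[0])
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
         for _ in range(n_vis)]
    def forward(v):
        return [1.0 / (1.0 + math.exp(-sum(v[i] * w[i][j]
                                           for i in range(n_vis))))
                for j in range(n_hidden)]
    return w, [forward(v) for v in data]

rng = random.Random(42)
data = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0]]
weights = []
for n_hidden in [4, 2]:          # two stacked RBM layers
    w, data = train_rbm(data, n_hidden, rng)
    weights.append(w)            # keep each layer's weights
print(len(weights), len(data[0]))  # 2 layers; final features are 2-D
```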
in S4, the network thresholds of the visible layer and the hidden layer of the RBM model are b and c, respectively, the connection weights of the two layers are W, and the DBN parameter training can convert the problem into a log-likelihood function maximization problem for solving the RBM on the training set in order to achieve the optimal solution of the parameter θ ═ { W, b, c }. This patent uses the FWA algorithm to optimize the network model parameters.
The Firework Algorithm (FWA) is an efficient swarm intelligence optimization algorithm that models the optimization of a population on the process of a fireworks explosion. Each firework individual represents a feasible solution; offspring sparks are generated through a specific explosion strategy, and the best solutions are selected in a certain manner as candidate solutions of the next generation, driving the population toward the optimum step by step. The algorithm balances global exploration capability and local search capability, and shows excellent performance and high efficiency in solving complex optimization problems.
The firework algorithm comprises four parts: an explosion operator, a mutation operator, a mapping rule, and a selection strategy. The explosion operator mainly covers the explosion radius, the number of explosion sparks, the explosion intensity, and so on; the mutation operator is generally Gaussian mutation; the mapping rules mainly include modular-arithmetic mapping and random mapping; and the selection strategies mainly include distance-based selection and random selection.
Step 4-1: initial population
Set the initial population size n, the spark limiting parameter m, the initial iteration count t = 0, and the maximum iteration count. The objects optimized by the firework algorithm are the connection weights W, the visible layer thresholds b, and the hidden layer thresholds c randomly generated by the RBMs in the DBN; for a given RBM structure, the connection weights and thresholds are encoded directly, so a firework individual in the population can be represented as [W_1, W_2, …, W_{n1}, b_1, b_2, …, b_{n2}, c_1, c_2, …, c_{n2}].
Step 4-2: fitness function
The aim of training the algorithm model is to bring the result of the network output layer as close as possible to the expected result through continuous iterative computation, thereby obtaining the weights and thresholds between the neuron nodes when the network output is optimal. Because the fitness function of the firework algorithm is tied to the total error function of the neural network, the total error function is introduced into the firework algorithm to compute the fitness value of each firework. The fitness function fit is therefore:

fit = E

E = \frac{1}{2} \sum_{k=1}^{K} \sum_{j=1}^{q} (d_j^k - y_j^k)^2

where E is the total error of the neural network, K is the number of data samples, q is the number of hidden layer neurons, d is the expected output of the network, and y is the actual output of the network. The fitness function directly reflects the magnitude of the error between the actual and expected outputs: the smaller the error, the better the fitness.
Step 4-3: explosion operator
The explosion operator covers the explosion radius, the number of explosion sparks, the explosion intensity, and so on. The fitness value f(x_i) of each firework is computed, each spark is displaced within the radius by a uniformly distributed random number U(-1,1), and the explosion radius R_i and the number of sparks S_i of each firework are calculated as:

R_i = \hat{R} \cdot \frac{f(x_i) - y_{min} + \varepsilon}{\sum_{i=1}^{N} (f(x_i) - y_{min}) + \varepsilon}

S_i = m \cdot \frac{y_{max} - f(x_i) + \varepsilon}{\sum_{i=1}^{N} (y_{max} - f(x_i)) + \varepsilon}

where i = 1, 2, …, N and N is the total number of fireworks; y_{max} and y_{min} are the maximum and minimum fitness values in the current population, corresponding to the worst and the best firework, respectively, since the fitness is an error to be minimized; f(x_i) is the fitness value of firework x_i; \hat{R} and m are constants limiting the maximum explosion radius and the maximum number of sparks generated by an explosion, respectively; \varepsilon is a very small constant that avoids division by zero.
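A sketch of the two explosion-operator quantities, under the convention that fitness is an error to be minimized, so the best firework receives the smallest radius and the most sparks (names are illustrative):

```python
# Sketch of the explosion operator: per-firework radius R_i and spark
# count S_i, following
#   R_i = R_hat * (f_i - y_min + eps) / (sum_k (f_k - y_min) + eps)
#   S_i = m     * (y_max - f_i + eps) / (sum_k (y_max - f_k) + eps)
# with fitness an error to minimize. Names are illustrative.
EPS = 1e-12   # avoids division by zero

def explosion(fitness, r_hat, m):
    y_min, y_max = min(fitness), max(fitness)
    denom_r = sum(f - y_min for f in fitness) + EPS
    denom_s = sum(y_max - f for f in fitness) + EPS
    radii  = [r_hat * (f - y_min + EPS) / denom_r for f in fitness]
    sparks = [m * (y_max - f + EPS) / denom_s for f in fitness]
    return radii, sparks

radii, sparks = explosion([0.1, 0.4, 0.5], r_hat=2.0, m=10)
print(radii, sparks)
```

With these fitness values the best firework (error 0.1) gets a near-zero radius and roughly 8 of the 10 sparks, concentrating local search around good solutions.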
Step 4-4: mutation operator
The mutation operator is introduced to increase the diversity of the exploding firework population and avoid falling into local extrema; it is realized through Gaussian mutation. Suppose firework x_i is selected for Gaussian mutation; the Gaussian mutation operation in dimension k is:

\hat{x}_i^k = x_i^k + (X_B^k - x_i^k) \cdot Gaussian(1,1)

where \hat{x}_i^k is the Gaussian mutation spark generated after the firework is mutated, x_i^k is the position of the i-th firework in dimension k, X_B^k is the position in dimension k of the best individual X_B in the current population, and Gaussian(1,1) is a Gaussian-distributed random number with mean 1 and variance 1.
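A sketch of the Gaussian mutation under one plausible reading of the operation, x̂_i^k = x_i^k + (X_B^k − x_i^k)·Gaussian(1,1); note that the classic FWA instead uses x̂_i^k = x_i^k · Gaussian(1,1). All names are illustrative:

```python
# Sketch of Gaussian mutation guided toward the best individual, under
# the assumed reading x_hat_k = x_k + (X_B_k - x_k) * Gaussian(1, 1).
# Illustrative only; the classic FWA multiplies x_k by Gaussian(1,1).
import random

def gaussian_mutation(x, x_best):
    g = random.gauss(1.0, 1.0)   # mean 1, standard deviation 1
    return [xk + (bk - xk) * g for xk, bk in zip(x, x_best)]

random.seed(0)
spark = gaussian_mutation([0.2, 0.8], [0.5, 0.5])
print(spark)
```

Because the random factor has mean 1, the mutated spark lands on average at the best individual's position while still scattering around it.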
Step 4-5: Mapping rules
After Gaussian mutation, some sparks may fall outside the boundary of the feasible region, and sparks outside the feasible region are useless after the explosion. Sparks beyond the boundary are handled mainly by the modular-arithmetic mapping rule, calculated as:

\hat{x}_i^k = x_{min}^k + |x_i^k| \bmod (x_{max}^k - x_{min}^k)

where \hat{x}_i^k is the position in dimension k of the i-th firework individual that exceeded the boundary, and x_{min}^k and x_{max}^k are the lower and upper boundaries of the fireworks in dimension k, respectively. Alternatively, the random mapping \hat{x}_i^k = x_{min}^k + Rand(0,1) \cdot (x_{max}^k - x_{min}^k) may be used, where Rand(0,1) is a random number uniformly distributed on the interval [0,1].
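A sketch of the modular-arithmetic mapping rule, x̂ = x_min + |x| mod (x_max − x_min), applied per dimension (illustrative names):

```python
# Sketch of the modular mapping rule for out-of-bound sparks:
#   x_hat_k = x_min_k + |x_k| mod (x_max_k - x_min_k)
# Each coordinate is wrapped back into its own [lo, hi) interval.
# Names are illustrative.

def map_back(x, lo, hi):
    return [l + abs(xk) % (h - l) for xk, l, h in zip(x, lo, hi)]

pos = map_back([7.3, -2.5], lo=[0.0, 0.0], hi=[1.0, 2.0])
print(pos)  # every coordinate lands inside its [lo, hi) interval
```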
Step 4-6: Selection strategy
The selection operation keeps a share of excellent fireworks and sparks in the feasible region as the fireworks for the next explosion, cycling through the best individuals after each explosion so as to home in on the optimal solution of the problem. Let the candidate set be K and the firework population size be N. The individual with the smallest error in the candidate set is deterministically carried into the next generation as a firework, and the remaining N - 1 fireworks are selected from the candidate set by roulette. Crowding in the candidate set is measured by the Euclidean distance: the sum R(x_i) of the distances between the current individual and all other individuals in the candidate set is calculated as:

R(x_i) = \sum_{j \in K} d(x_i, x_j) = \sum_{j \in K} \| x_i - x_j \|

where R(x_i) is the sum of the distances between x_i and all other individuals in the current candidate set, d(x_i, x_j) = \| x_i - x_j \| is the Euclidean distance between the i-th and j-th individuals, and K is the candidate set.

After R(x_i) is computed, the selection probability of each firework is determined by the roulette method:

p(x_i) = \frac{R(x_i)}{\sum_{j \in K} R(x_j)}
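A sketch of the distance-based roulette probabilities: each candidate's selection probability is proportional to its summed Euclidean distance to all other candidates, so isolated individuals are favored and diversity is preserved (illustrative names):

```python
# Sketch of the distance-based selection: p_i = R(x_i) / sum_j R(x_j),
# where R(x_i) is the summed Euclidean distance from candidate i to all
# other candidates. Names are illustrative.
import math

def euclid(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def selection_probs(candidates):
    R = [sum(euclid(x, y) for y in candidates) for x in candidates]
    total = sum(R)
    return [r / total for r in R]

cands = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
p = selection_probs(cands)
print(p)  # the isolated candidate [5, 5] gets the largest probability
```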
Step 4-7: Determining the stop condition
If the stopping condition is met, the loop exits and the optimal result is output; if not, return to step 4-2 and continue the cycle. The stopping condition is reaching the maximum number of iterations t_{max}. The best individual X_B of the population is then decoded to obtain the optimal DBN parameter space [W_B, b_B, c_B]. This arrangement facilitates finding the optimal parameter space.
In step 5, for comparison with DBN models optimized by the genetic algorithm, the particle swarm algorithm, and the ant colony algorithm, the invention adopts the Mean Absolute Percentage Error (MAPE) and the Root Mean Square Error (RMSE) as evaluation indexes, where:

MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{\hat{y}_i} \right| \times 100\%

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}

where n is the number of output training samples, i is the serial number of an output training sample, y_i is the predicted value of the i-th output training sample, and \hat{y}_i is its actual value.
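A minimal sketch of MAPE and RMSE as defined in step 5 (illustrative names; `pred` are the predicted values, `actual` the true values):

```python
# Sketch of the two evaluation indexes:
#   MAPE = (1/n) * sum |(y_i - a_i) / a_i| * 100%
#   RMSE = sqrt((1/n) * sum (y_i - a_i)^2)
# with y_i predicted and a_i actual. Names are illustrative.
import math

def mape(pred, actual):
    n = len(pred)
    return 100.0 / n * sum(abs((p - a) / a) for p, a in zip(pred, actual))

def rmse(pred, actual):
    n = len(pred)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / n)

pred, actual = [1.1, 1.9, 3.2], [1.0, 2.0, 3.0]
print(mape(pred, actual), rmse(pred, actual))
```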
The smaller the values of MAPE and RMSE, the higher the prediction accuracy of the model; both indexes are relative values, however, and comparing them is meaningful only on the same data. This arrangement makes the model's prediction accuracy easy to express intuitively.
In step 6, the FWA_DBN credit scoring model is deployed to the loan application platform to output a real-time application credit score, realizing real-time examination and approval of applying clients; performance data are input into the model for training at regular intervals, realizing online updating of the model. This arrangement favors continuous optimization of the model.
also provided is a FWA _ DBN-based customer credit evaluation system, comprising the following modules:
a dataset acquisition and labeling module: used for obtaining, from the loan system back end, a training data set comprising application, repayment, operation, and third-party data;
a characteristic data extraction and normalization processing module: used for extracting, from the acquired training data set, the characteristic data strongly associated with repayment performance, normalizing the characteristic data, and splitting it into a training set and a test set;
the DBN model building module is used for building a DBN model according to a given training sample and the number of nodes of the hidden layer; automatically optimizing RBM network parameters theta ═ W, b, c in the DBN model by adopting an FWA algorithm;
the DBN training test module is used for training the optimized DBN by using a training set and verifying by using a test set to obtain the accuracy of model prediction;
a DBN prediction module: the trained DBN network performs credit risk grade evaluation and prediction for online applying clients; this setting can predict a client's credit grade in advance, making it convenient to provide the client with corresponding credit rights.
The working process of the invention comprises the following steps:
S1, sampling customers with existing loan performance as modeling samples, and collecting credit characteristic data of the customers;
S2, carrying out data preprocessing on the obtained modeling data, normalizing the preprocessed data with the min-max method, and dividing it into a training set and a test set in a preset proportion;
S3, preliminarily determining the structure of the DBN according to the training data characteristics, and initializing the relevant parameters of the DBN, including: the input nodes, the output nodes, the maximum number of layers, the number of nodes in each layer, and the maximum number of iterations;
S4, training the DBN with the training set, and optimizing the network model parameters with the FWA algorithm to obtain the FWA-DBN prediction model;
S5, importing the verification set into the FWA-DBN for testing; if the test precision does not meet the preset threshold requirement, repeating step S3 and step S4 to retrain the FWA-DBN prediction model;
S6, deploying the FWA-DBN client credit evaluation model to the loan application platform to output a real-time application credit score, realizing real-time examination and approval of applying clients; performance data are input into the model for training at regular intervals, realizing online updating of the model.
Through S1-S6, customer data are collected as input and the FWA-DBN model scores the user's credit risk; when the test precision does not meet the preset threshold requirement, steps S3 and S4 are looped to retrain the FWA-DBN prediction model, finally providing a user credit risk score of qualified precision.
The parts of embodiment 2 that are the same as embodiment 1 are not repeated; the differences are as follows: in S1, clients with existing loan performance are sampled as modeling samples and their credit characteristic data are collected, including personal basic information, operation behavior buried-point data, and third-party data. This arrangement favors collecting user data for subsequent analysis.
In S2, missing data are filled with the median by interpolation to keep the data samples consistent. Because differing dimensions and overly large numerical differences among the original variables affect the training of the DBN, the original data must be normalized; the normalization function used in this patent is:

y = y_{min} + \frac{(y_{max} - y_{min}) \cdot (x - x_{min})}{x_{max} - x_{min}}

where y_{max} defaults to 1, y_{min} defaults to -1, x is the original data item currently being processed, x_{max} is the maximum of all original data items to be processed, and x_{min} is the minimum of all original data items to be processed.
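A minimal sketch of this min-max normalization, mapping each item to [-1, 1] by default (illustrative names):

```python
# Sketch of the min-max normalization
#   y = y_min + (y_max - y_min) * (x - x_min) / (x_max - x_min)
# mapping every data item into [y_min, y_max]. Names are illustrative.

def min_max(data, y_min=-1.0, y_max=1.0):
    x_min, x_max = min(data), max(data)
    span = x_max - x_min
    return [y_min + (y_max - y_min) * (x - x_min) / span for x in data]

print(min_max([10.0, 20.0, 30.0]))  # -> [-1.0, 0.0, 1.0]
```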
The normalized data set is split chronologically by application time into a training set and a test set in a 7:3 ratio; this arrangement favors improving the stability of the data.
In step 3, a DBN model is constructed according to the training set and the neural network parameters are initialized.
The deep belief network is a probability generation model and is formed by sequentially overlapping a plurality of Restricted Boltzmann Machines (RBMs), wherein the bottommost layer of the deep belief network receives input data vectors and converts the input data to a hidden layer through the RBMs, namely the input of a higher layer of RBMs is from the output of a lower layer of RBMs.
Step 3-1: establishing an energy function
The RBM is a two-layer stochastic neural network with bidirectional (undirected) connections between a visible layer and a hidden layer: the visible layer, whose units v represent the input, receives the data, while the hidden layer, whose units h represent hidden variables, extracts features. Let the visible layer state be v = (v_1, v_2, v_3, …, v_I), v_i ∈ {0,1}, the hidden layer state be h = (h_1, h_2, h_3, …, h_J), h_j ∈ {0,1}, the weight matrix be w, the visible-unit thresholds be b, and the hidden-unit thresholds be c. The energy function of the joint state (v, h) of all visible and hidden units is then:

E(v,h) = -\sum_{i=1}^{I} b_i v_i - \sum_{j=1}^{J} c_j h_j - \sum_{i=1}^{I} \sum_{j=1}^{J} v_i w_{ij} h_j

where w_{ij} is the connection weight between the i-th visible unit and the j-th hidden unit; v_i and h_j are the states of visible unit i and hidden unit j, respectively; b_i and c_j are the thresholds of visible unit i and hidden unit j; I is the number of visible units and J is the number of hidden units.
Step 3-2: joint probability distribution
The joint probability distribution of the visible and hidden layers follows from the energy function E(v, h) above:

P(v,h) = \frac{1}{Z} e^{-E(v,h)}

Z = \sum_{v} \sum_{h} e^{-E(v,h)}

where Z is the partition function, the normalization constant of the simulated physical system, obtained by summing e^{-E(v,h)} over all joint states of the visible and hidden units.
Step 3-3: independent distribution
From the joint probability distribution, the marginal distribution of the visible layer vector v is obtained as:

P(v) = \frac{1}{Z} \sum_{h} e^{-E(v,h)}

The probability of the hidden layer vector h given a random input visible layer vector v is then:

P(h \mid v) = \prod_{j=1}^{J} P(h_j \mid v)

and the probability of the visible layer vector v given a random input hidden layer vector h is:

P(v \mid h) = \prod_{i=1}^{I} P(v_i \mid h)

Since the structural units of the RBM are binary, the logistic sigmoid activation function is used, written as:

\sigma(x) = \frac{1}{1 + e^{-x}}
Step 3-4: probability of activation
According to the structure and state probabilities of the RBM, the states of the hidden units are mutually independent when the states of the visible units are given; likewise, the states of the visible units are mutually independent when the states of the hidden units are given. The activation probabilities of hidden unit j and visible unit i are therefore:

P(h_j = 1 \mid v) = \sigma\left(c_j + \sum_{i=1}^{I} v_i w_{ij}\right)

P(v_i = 1 \mid h) = \sigma\left(b_i + \sum_{j=1}^{J} w_{ij} h_j\right)
step 3-5: adjusting and optimizing weight and threshold value by gradient descent method
The error value between the actual output and the expected output is obtained for gradient descent:

E(t) = \frac{1}{2} (z(t) - y(t))^2

where E(t) is the error at iteration t, and z(t) and y(t) are the expected output value and the actual output value, respectively. The gradient of each parameter with respect to the error is then computed, and the parameters are tuned along the direction of descent of the gradient:

W_{ij}(t+1) = W_{ij}(t) - \mu \frac{\partial E(t)}{\partial W_{ij}(t)}

b_j(t+1) = b_j(t) - \mu \frac{\partial E(t)}{\partial b_j(t)}

where \mu is the learning rate, and W_{ij}(t+1) and b_j(t+1) are the tuned weight and threshold, respectively.
After all the RBMs have been trained, they are stacked to form the deep belief network.
In S3, a DBN classification network model is established; network pre-training is performed with unlabeled pre-training data samples, and parameter fine-tuning with labeled samples, as follows:
(1) input the normalized pre-training sample data, set the network structure parameters, randomly initialize the weights between the network layers, and initialize the threshold of each layer to zero;
(2) train the RBM layers one by one with the pre-training sample data, the output of each RBM layer serving as the input of the next, until training finishes, obtaining the network weights and thresholds of each layer;
(3) using the network parameters obtained in step (2) as initial values, expand the DBN into a BP network structure with the labeled data samples, adding a classification layer at the top of the network as the final classification judgment layer for the output network features; the obtained result is compared with the labels of the input labeled data, the resulting error is back-propagated, and the parameters of the whole network are fine-tuned. This arrangement facilitates adjusting the parameters through the errors.
In step 4, the network thresholds of the visible layer and the hidden layer of the RBM model are b and c, respectively, and the connection weights between the two layers are W. To reach the optimal solution for the parameters θ = {W, b, c}, DBN parameter training can be converted into maximizing the log-likelihood of the RBM on the training set. This patent uses the FWA algorithm to optimize these network model parameters.
The Firework Algorithm (FWA) is an efficient swarm intelligence optimization algorithm that models the optimization of a population on the process of a fireworks explosion. Each firework individual represents a feasible solution; offspring sparks are generated through a specific explosion strategy, and the best solutions are selected in a certain manner as candidate solutions of the next generation, driving the population toward the optimum step by step. The algorithm balances global exploration capability and local search capability, and shows excellent performance and high efficiency in solving complex optimization problems.
The firework algorithm comprises four parts: an explosion operator, a mutation operator, a mapping rule, and a selection strategy. The explosion operator mainly covers the explosion radius, the number of explosion sparks, the explosion intensity, and so on; the mutation operator is generally Gaussian mutation; the mapping rules mainly include modular-arithmetic mapping and random mapping; and the selection strategies mainly include distance-based selection and random selection.
Step 4-1: initial population
Set the initial population size n, the spark limiting parameter m, the initial iteration count t = 0, and the maximum iteration count. The objects optimized by the firework algorithm are the connection weights W, the visible layer thresholds b, and the hidden layer thresholds c randomly generated by the RBMs in the DBN; for a given RBM structure, the connection weights and thresholds are encoded directly, so a firework individual in the population can be represented as [W_1, W_2, …, W_{n1}, b_1, b_2, …, b_{n2}, c_1, c_2, …, c_{n2}].
Step 4-2: fitness function
The aim of training the algorithm model is to bring the result of the network output layer as close as possible to the expected result through continuous iterative computation, thereby obtaining the weights and thresholds between the neuron nodes when the network output is optimal. Because the fitness function of the firework algorithm is tied to the total error function of the neural network, the total error function is introduced to compute the fitness value of each firework individual. The fitness function fit is therefore:

fit = E

E = \frac{1}{2} \sum_{k=1}^{K} \sum_{j=1}^{q} (d_j^k - y_j^k)^2

where E is the total error of the neural network, K is the number of data samples, q is the number of hidden layer neurons, d is the expected output of the network, and y is the actual output of the network. The fitness function directly reflects the magnitude of the error between the actual and expected outputs: the smaller the error, the better the fitness.
Step 4-3: explosion operator
The explosion operator covers the explosion radius, the number of explosion sparks, the explosion intensity, and so on. The fitness value f(x_i) of each firework is computed, each spark is displaced within the radius by a uniformly distributed random number U(-1,1), and the explosion radius R_i and the number of sparks S_i of each firework are calculated as:

R_i = \hat{R} \cdot \frac{f(x_i) - y_{min} + \varepsilon}{\sum_{i=1}^{N} (f(x_i) - y_{min}) + \varepsilon}

S_i = m \cdot \frac{y_{max} - f(x_i) + \varepsilon}{\sum_{i=1}^{N} (y_{max} - f(x_i)) + \varepsilon}

where i = 1, 2, …, N and N is the total number of fireworks; y_{max} and y_{min} are the maximum and minimum fitness values in the current population, corresponding to the worst and the best firework, respectively, since the fitness is an error to be minimized; f(x_i) is the fitness value of firework x_i; \hat{R} and m are constants limiting the maximum explosion radius and the maximum number of sparks generated by an explosion, respectively; \varepsilon is a very small constant that avoids division by zero.
Step 4-4: mutation operator
The mutation operator is introduced to increase the diversity of the exploding firework population and avoid falling into local extrema; it is realized through Gaussian mutation. Suppose firework x_i is selected for Gaussian mutation; the Gaussian mutation operation in dimension k is:

\hat{x}_i^k = x_i^k + (X_B^k - x_i^k) \cdot Gaussian(1,1)

where \hat{x}_i^k is the Gaussian mutation spark generated after the firework is mutated, x_i^k is the position of the i-th firework in dimension k, X_B^k is the position in dimension k of the best individual X_B in the current population, and Gaussian(1,1) is a Gaussian-distributed random number with mean 1 and variance 1.
Step 4-5: Mapping rules
After Gaussian mutation, some sparks may fall outside the boundary of the feasible region, and sparks outside the feasible region are useless after the explosion. Sparks beyond the boundary are handled mainly by the modular-arithmetic mapping rule, calculated as:

\hat{x}_i^k = x_{min}^k + |x_i^k| \bmod (x_{max}^k - x_{min}^k)

where \hat{x}_i^k is the position in dimension k of the i-th firework individual that exceeded the boundary, and x_{min}^k and x_{max}^k are the lower and upper boundaries of the fireworks in dimension k, respectively. Alternatively, the random mapping \hat{x}_i^k = x_{min}^k + Rand(0,1) \cdot (x_{max}^k - x_{min}^k) may be used, where Rand(0,1) is a random number uniformly distributed on the interval [0,1].
Step 4-6: Selection strategy
The selection operation keeps a share of excellent fireworks and sparks in the feasible region as the fireworks for the next explosion, cycling through the best individuals after each explosion so as to home in on the optimal solution of the problem. Let the candidate set be K and the firework population size be N. The individual with the smallest error in the candidate set is deterministically carried into the next generation as a firework, and the remaining N - 1 fireworks are selected from the candidate set by roulette. Crowding in the candidate set is measured by the Euclidean distance: the sum R(x_i) of the distances between the current individual and all other individuals in the candidate set is calculated as:

R(x_i) = \sum_{j \in K} d(x_i, x_j) = \sum_{j \in K} \| x_i - x_j \|

where R(x_i) is the sum of the distances between x_i and all other individuals in the current candidate set, d(x_i, x_j) = \| x_i - x_j \| is the Euclidean distance between the i-th and j-th individuals, and K is the candidate set.

After R(x_i) is computed, the selection probability of each firework is determined by the roulette method:

p(x_i) = \frac{R(x_i)}{\sum_{j \in K} R(x_j)}
Step 4-7: Determining the stop condition
If the stopping condition is met, the loop exits and the optimal result is output; if not, return to step 4-2 and continue the cycle. The stopping condition is reaching the maximum number of iterations t_{max}. The best individual X_B of the population is then decoded to obtain the optimal DBN parameter space [W_B, b_B, c_B]. This arrangement facilitates finding the optimal parameter space.
In step 5, for comparison with DBN models optimized by the genetic algorithm, the particle swarm algorithm, and the ant colony algorithm, the invention adopts the Mean Absolute Percentage Error (MAPE) and the Root Mean Square Error (RMSE) as evaluation indexes, where:

MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{\hat{y}_i} \right| \times 100\%

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}

where n is the number of output training samples, i is the serial number of an output training sample, y_i is the predicted value of the i-th output training sample, and \hat{y}_i is its actual value.
The smaller the values of MAPE and RMSE, the higher the prediction accuracy of the model; both indexes are relative values, however, and comparing them is meaningful only on the same data. This arrangement makes the model's prediction accuracy easy to express intuitively.
In S6, the FWA_DBN credit scoring model is deployed to the loan application platform to output a real-time application credit score, realizing real-time examination and approval of applying clients; performance data are input into the model for training at regular intervals, realizing online updating of the model. This arrangement favors continuous optimization of the model.
also provided is a FWA _ DBN-based customer credit evaluation system, comprising the following modules:
a dataset acquisition and labeling module: used for obtaining, from the loan system back end, a training data set comprising application, repayment, operation, and third-party data;
a characteristic data extraction and normalization processing module: used for extracting, from the acquired training data set, the characteristic data strongly associated with repayment performance, normalizing the characteristic data, and splitting it into a training set and a test set;
the DBN model building module is used for building a DBN model according to a given training sample and the number of nodes of the hidden layer; automatically optimizing RBM network parameters theta ═ W, b, c in the DBN model by adopting an FWA algorithm,
the DBN training test module is used for training the optimized DBN by using a training set and verifying by using a test set to obtain the accuracy of model prediction;
a DBN prediction module: the trained DBN network performs credit risk grade evaluation and prediction for online applying clients; this setting can predict a client's credit grade in advance, making it convenient to provide the client with corresponding credit rights.
Compared with other, shallower neural networks, the DBN is a deep and efficient learning algorithm: it extracts deep features of the data, achieves extraction and classification of high-dimensional nonlinear data features, and improves the generalization capability and prediction accuracy of the network. Compared with optimization algorithms such as the genetic algorithm, the particle swarm algorithm, and the ant colony algorithm, the firework algorithm drives the population toward the optimum step by step, balances global exploration and local search, and shows excellent performance and efficiency in solving complex optimization problems. The FWA-optimized DBN model can process massive data and has the advantages of fast convergence, global convergence, and stable prediction, making it suitable for real-time assessment of client credit on an Internet finance platform.
The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method and its core concepts. The foregoing is only a preferred embodiment of the present invention. Since the written description is necessarily limited while possible concrete structures are not, it will be apparent to those skilled in the art that various modifications, refinements or changes may be made, and the technical features described above may be combined in any suitable manner, without departing from the principle of the present invention. Such modifications, variations, combinations, or applications of the spirit of the invention to other uses shall all fall within the scope of protection defined by the claims.

Claims (8)

1. A customer credit assessment method and system based on FWA_DBN, characterized in that the method comprises the following steps:
S1, sampling clients with existing loan repayment performance as modeling samples, and collecting the clients' credit feature data;
S2, preprocessing the obtained modeling data, normalizing the preprocessed data with the min-max method, and dividing it into a training set and a test set in a preset proportion;
S3, preliminarily determining the structure of the DBN according to the characteristics of the training data, and initializing the relevant parameters of the DBN, including: the number of input nodes, the number of output nodes, the maximum number of layers, the number of nodes in each layer, and the maximum number of iterations;
S4, training the DBN with the training set, and optimizing the network model parameters with the FWA algorithm to obtain the FWA-DBN prediction model;
S5, importing the verification set into the FWA-DBN for testing; if the test precision does not meet the preset threshold requirement, repeating step S3 and step S4 to retrain the FWA-DBN prediction model;
S6, deploying the FWA-DBN client credit evaluation model to the loan application platform to output a real-time application credit score, realizing real-time approval of applicant clients, and periodically feeding performance data back into model training to realize online updating of the model.
2. The FWA_DBN-based customer credit assessment method and system according to claim 1, characterized in that: in S1, clients with existing loan repayment performance are sampled as modeling samples, and the clients' credit feature data are collected, including personal basic information, operation-behavior buried-point (event-tracking) data and third-party data.
3. The FWA_DBN-based customer credit assessment method and system according to claim 1, characterized in that: in S2, missing data are filled with the median by interpolation to keep the data samples consistent. Because differing dimensions and large numeric ranges would affect DBN training, the original variables must be normalized; the normalization function used in this patent is:
$$y = y_{\min} + \frac{(y_{\max}-y_{\min})(x - x_{\min})}{x_{\max}-x_{\min}}$$
where $y_{\max}$ defaults to 1, $y_{\min}$ defaults to -1, $x$ is the original data item currently being processed, $x_{\max}$ is the maximum of all original data items to be processed, and $x_{\min}$ is the minimum of all original data items to be processed.
The normalized data set is then split chronologically by application time into a training set and a test set in a 7:3 ratio.
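As an illustration of the normalization and split in S2, the min-max formula and a 7:3 chronological split can be sketched as follows. This is a minimal Python/NumPy sketch; the feature values and the function name `min_max_scale` are hypothetical, not part of the claims.

```python
import numpy as np

def min_max_scale(x, y_min=-1.0, y_max=1.0):
    """Min-max normalization to [y_min, y_max], following the claim's formula."""
    x = np.asarray(x, dtype=float)
    return y_min + (y_max - y_min) * (x - x.min()) / (x.max() - x.min())

# Hypothetical feature column (e.g. monthly income), ordered by application time.
income = np.array([3000.0, 5000.0, 8000.0, 12000.0, 20000.0])
scaled = min_max_scale(income)

# 7:3 chronological split: earlier applications train, later ones test.
n_train = int(len(scaled) * 0.7)          # 3 of the 5 samples
train, test = scaled[:n_train], scaled[n_train:]
```

Splitting by application time rather than at random avoids leaking future repayment behavior into the training set.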
4. The FWA _ DBN-based client credit evaluation method and system of claim 1, wherein in S3, the neural network parameters are initialized during the construction of the DBN model according to the training set.
The deep belief network is a probabilistic generative model formed by stacking several Restricted Boltzmann Machines (RBMs) in sequence: the bottommost layer receives the input data vector and transforms it to a hidden layer through an RBM, so the input of each higher-level RBM comes from the output of the RBM below it.
Step 3-1: establishing an energy function
The RBM is a two-layer network with symmetric (bidirectional) connections between a visible layer and a hidden layer: the visible layer, a layer of visible units v representing the input, receives the input, while the hidden layer, a layer of hidden units h representing latent variables, extracts features. Let the visible units be $v = (v_1, v_2, \ldots, v_I)$ with $v_i \in \{0,1\}$, the hidden units be $h = (h_1, h_2, \ldots, h_J)$ with $h_j \in \{0,1\}$, the weight matrix be $W$, the visible-unit thresholds be $b$ and the hidden-unit thresholds be $c$. The energy function of the joint state $(v, h)$ of all visible and hidden units is:
$$E(v,h) = -\sum_{i=1}^{I} b_i v_i - \sum_{j=1}^{J} c_j h_j - \sum_{i=1}^{I}\sum_{j=1}^{J} v_i w_{ij} h_j$$
where $w_{ij}$ is the connection weight between the $i$-th visible unit and the $j$-th hidden unit, $v_i$ and $h_j$ are the states of visible unit $i$ and hidden unit $j$, $b_i$ and $c_j$ are the thresholds of visible unit $i$ and hidden unit $j$, $I$ is the number of visible units, and $J$ is the number of hidden units.
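The energy function can be evaluated directly for a toy RBM. The sketch below uses the visible thresholds b and hidden thresholds c named in claim 5; the dimensions, random seed and unit states are illustrative assumptions.

```python
import numpy as np

def rbm_energy(v, h, W, b, c):
    """E(v,h) = -sum_i b_i v_i - sum_j c_j h_j - sum_ij v_i W_ij h_j."""
    return -(b @ v) - (c @ h) - (v @ W @ h)

# Tiny illustrative RBM: 3 visible units, 2 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 2))   # connection weights w_ij
b = np.zeros(3)                          # visible-layer thresholds
c = np.zeros(2)                          # hidden-layer thresholds
v = np.array([1.0, 0.0, 1.0])            # a visible state
h = np.array([0.0, 1.0])                 # a hidden state
E = rbm_energy(v, h, W, b, c)
```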
Step 3-2: joint probability distribution
From the energy function $E(v,h)$ obtained above, the joint probability distribution over the visible and hidden layers is:
$$P(v,h) = \frac{1}{Z}\, e^{-E(v,h)}$$
$$Z = \sum_{v}\sum_{h} e^{-E(v,h)}$$
where $Z$ is the normalization constant (partition function) of the simulated physical system, obtained by summing over all joint states of the visible and hidden units.
Step 3-3: independent distribution
From the joint probability distribution, the marginal distribution of the visible vector $v$ is:
$$P(v) = \frac{1}{Z}\sum_{h} e^{-E(v,h)}$$
Given a random input visible vector $v$, the probability of the hidden vector $h$ is:
$$P(h \mid v) = \prod_{j=1}^{J} P(h_j \mid v)$$
Given a random input hidden vector $h$, the probability of the visible vector $v$ is:
$$P(v \mid h) = \prod_{i=1}^{I} P(v_i \mid h)$$
Since the RBM units are binary, the logistic sigmoid activation function is written as
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
Step 3-4: probability of activation
According to the structure and state probabilities of the RBM, given the states of the visible units, the states of the hidden units are mutually independent; likewise, given the states of the hidden units, the states of the visible units are mutually independent. The activation probabilities of hidden unit $h_j$ and visible unit $v_i$ are respectively:
$$P(h_j = 1 \mid v) = \sigma\!\left(c_j + \sum_{i=1}^{I} v_i w_{ij}\right)$$
$$P(v_i = 1 \mid h) = \sigma\!\left(b_i + \sum_{j=1}^{J} w_{ij} h_j\right)$$
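The two activation probabilities above amount to one Gibbs sampling step: sample a hidden state from $P(h \mid v)$, then reconstruct the visible layer from $P(v \mid h)$. The sketch below is illustrative; array sizes and the RNG seed are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_h_given_v(v, W, c):
    """P(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W_ij)."""
    return sigmoid(c + v @ W)

def p_v_given_h(h, W, b):
    """P(v_i = 1 | h) = sigmoid(b_i + sum_j W_ij h_j)."""
    return sigmoid(b + W @ h)

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(4, 3))   # 4 visible x 3 hidden
b, c = np.zeros(4), np.zeros(3)
v = np.array([1.0, 0.0, 1.0, 1.0])

ph = p_h_given_v(v, W, c)                 # hidden activation probabilities
h = (rng.random(3) < ph).astype(float)    # sample a binary hidden state
pv = p_v_given_h(h, W, b)                 # one Gibbs step back to the visible layer
```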
step 3-5: adjusting and optimizing weight and threshold value by gradient descent method
The error between the actual and expected values is computed for the gradient descent method:
$$E(t) = \frac{1}{2}\sum \left(z(t) - y(t)\right)^{2}$$
where $E(t)$ is the error at iteration $t$, and $z(t)$ and $y(t)$ are the expected output and the actual output at iteration $t$, respectively.
The gradient of the weights with respect to this error is then computed, and the parameters are tuned along the direction of gradient descent:
$$W_{ij}(t+1) = W_{ij}(t) - \mu \frac{\partial E(t)}{\partial W_{ij}(t)}$$
$$b_{j}(t+1) = b_{j}(t) - \mu \frac{\partial E(t)}{\partial b_{j}(t)}$$
where $\mu$ is the learning rate, $E(t)$ is the error at iteration $t$, and $W_{ij}(t+1)$ and $b_{j}(t+1)$ are the tuned weight and threshold, respectively.
After every RBM has been trained, the RBMs are stacked to form the deep belief network.
In S3, the DBN classification network model is established; the network is pre-trained with unlabeled pre-training data samples and the parameters are fine-tuned with labeled samples, as follows:
(1) input the normalized pre-training sample data, set the network structure parameters, randomly initialize the weights between network layers, and initialize the threshold of each layer to zero;
(2) train the RBM layers one by one with the pre-training sample data, the output of each RBM layer serving as the input of the next layer, until training finishes, obtaining the network weights and thresholds of each layer;
(3) using the network parameters obtained in step (2) as initial values and using the labeled data samples, unfold the DBN into a BP network structure and add a classification layer on top of the network as the final classification decision layer for the features output by the network; compare the obtained result with the labels of the input labeled data, and back-propagate the resulting error to fine-tune the parameters of the whole network.
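Steps (1) and (2) above, the greedy layer-wise pre-training, can be sketched with contrastive divergence (CD-1), a standard RBM training rule the patent does not name explicitly. The toy data, layer sizes, learning rate and epoch count are hypothetical choices; the supervised BP fine-tuning of step (3) is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, lr=0.1, epochs=10, seed=0):
    """Train one RBM with CD-1; returns (W, b, c) and the hidden representation."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
    b, c = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        v0 = data
        ph0 = sigmoid(c + v0 @ W)                     # P(h=1|v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = sigmoid(b + h0 @ W.T)                   # reconstruction
        ph1 = sigmoid(c + pv1 @ W)
        # CD-1 approximation of the log-likelihood gradient
        W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(data)
        b += lr * (v0 - pv1).mean(axis=0)
        c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c, sigmoid(c + data @ W)

# Greedy layer-wise stacking: each RBM's hidden output feeds the next RBM.
rng = np.random.default_rng(42)
X = (rng.random((50, 8)) < 0.5).astype(float)   # toy binary training data
layer_sizes = [6, 4]                            # hidden nodes per RBM layer
inputs, params = X, []
for n_hidden in layer_sizes:
    W, b, c, inputs = train_rbm(inputs, n_hidden)
    params.append((W, b, c))
```

In the full method, the `params` collected here would initialize the unfolded BP network before supervised fine-tuning.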
5. The FWA_DBN-based customer credit evaluation method and system according to claim 1, wherein in S4, the network thresholds of the visible and hidden layers of the RBM model are b and c respectively, and the connection weights between the two layers are W; DBN parameter training seeks the optimal solution of the parameters θ = {W, b, c}, so the problem can be transformed into maximizing the log-likelihood of the RBM on the training set. This patent uses the FWA algorithm to optimize the network model parameters.
The Fireworks Algorithm (FWA) is an efficient swarm-intelligence optimization algorithm that models swarm optimization on the process of a firework explosion. Each firework individual represents a feasible solution; offspring sparks are generated through a specific explosion strategy, and candidate solutions for the next generation are selected in a certain manner, driving the population optimization step by step, balancing global exploration against local search, and showing excellent performance and high efficiency on complex optimization problems.
The fireworks algorithm consists mainly of four parts: the explosion operator, the mutation operator, the mapping rule and the selection strategy. The explosion operator mainly involves the explosion radius, the number of explosion sparks and the explosion intensity; the mutation operator generally uses Gaussian mutation; the mapping rules mainly include modular-arithmetic mapping and random mapping; and the selection strategies mainly include distance-based selection and random selection.
Step 4-1: initial population
Set the initial population size n, the spark-limit parameter m, the initial iteration count t = 0 and the maximum iteration count $t_{\max}$. The objects optimized by the fireworks algorithm are the connection weights W, visible-layer thresholds b and hidden-layer thresholds c randomly generated by the RBMs in the DBN; thus, for a given RBM structure, the connection weights and thresholds are arranged directly, and a firework individual in the population can be represented as $[W_1, W_2, \ldots, W_{n_1}, b_1, b_2, \ldots, b_{n_2}, c_1, c_2, \ldots, c_{n_2}]$.
Step 4-2: fitness function
The aim of training the algorithm model is to bring the output of the network as close as possible to the expected result through continuous iterative computation, thereby obtaining the weights and thresholds between the neuron nodes at which the network output is optimal. Because the fitness function of the fireworks algorithm is tied to the total error function of the neural network, the total error function is introduced into the fireworks algorithm to compute the fitness of each firework, so the fitness function fit is expressed as:
$$fit = E$$
$$E = \frac{1}{2}\sum_{k=1}^{K}\sum_{j=1}^{q}\left(d_{kj} - y_{kj}\right)^{2}$$
where $E$ is the total error of the neural network, $K$ is the number of data samples, $q$ is the number of hidden-layer neurons, $d$ is the expected output of the network and $y$ is the actual output of the network. The fitness function directly reflects the magnitude of the error between actual and expected output: the smaller the error, the better the fitness.
Step 4-3: explosion operator
The explosion operator mainly involves the explosion radius, the number of explosion sparks and the explosion intensity. After computing the fitness value $f(x_i)$ of each firework, the explosion radius $R_i$ and spark number $S_i$ of each firework are calculated as:
$$R_i = \hat{R} \cdot \frac{f(x_i) - y_{\min} + \varepsilon}{\sum_{i=1}^{N}\left(f(x_i) - y_{\min}\right) + \varepsilon}$$
$$S_i = M \cdot \frac{y_{\max} - f(x_i) + \varepsilon}{\sum_{i=1}^{N}\left(y_{\max} - f(x_i)\right) + \varepsilon}$$
where $i = 1, 2, \ldots, N$ and $N$ is the total number of fireworks; $y_{\max}$ and $y_{\min}$ are the maximum and minimum fitness values in the current population, corresponding to the worst and best fireworks respectively; $f(x_i)$ is the fitness value of firework $x_i$; $\hat{R}$ and $M$ are constants limiting the maximum explosion radius and the maximum number of sparks generated by an explosion, respectively; and $\varepsilon$ is a very small constant used to avoid division by zero.
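The radius and spark-number formulas can be sketched as follows for a minimized fitness (smaller error is better, so the best firework receives the smallest radius and the most sparks). The constants R̂ = 40 and M = 50 and the example fitness values are illustrative assumptions, not values from the patent.

```python
import numpy as np

def explosion(fitness, R_hat=40.0, M=50, eps=1e-12):
    """FWA explosion operator (minimization): better fireworks get a smaller
    explosion radius R_i and a larger spark count S_i."""
    f = np.asarray(fitness, dtype=float)
    y_min, y_max = f.min(), f.max()
    R = R_hat * (f - y_min + eps) / (np.sum(f - y_min) + eps)
    S = M * (y_max - f + eps) / (np.sum(y_max - f) + eps)
    return R, np.round(S).astype(int)

fitness = [0.9, 0.5, 0.1]   # hypothetical network errors of three fireworks
R, S = explosion(fitness)
```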
Step 4-4: mutation operator
The mutation operator is set to increase the diversity of the exploding firework population and avoid getting trapped in local extrema; it is realized through Gaussian mutation. Suppose firework $x_i$ is selected for Gaussian mutation; the Gaussian mutation operation in dimension $k$ is:
$$\hat{x}_i^k = x_i^k + \left(x_B^k - x_i^k\right) \cdot Gaussian(1,1)$$
where $\hat{x}_i^k$ denotes the Gaussian mutation spark generated after the firework mutates, $x_i^k$ denotes the position of the $i$-th firework in dimension $k$, $x_B^k$ is the position in dimension $k$ of the optimal individual $X_B$ in the current population, and $Gaussian(1,1)$ is a Gaussian-distributed random number with mean 1 and variance 1.
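The Gaussian mutation step can be sketched as below, assuming the reading of the operation as a move toward (or past) the current best individual $X_B$, scaled by a Gaussian(1,1) draw per dimension; all numeric values are illustrative.

```python
import numpy as np

def gaussian_mutation(x_i, x_best, rng):
    """Gaussian mutation spark: each dimension of firework x_i moves toward
    (or overshoots) the best individual, scaled by g ~ N(mean=1, var=1)."""
    g = rng.normal(loc=1.0, scale=1.0, size=x_i.shape)
    return x_i + (x_best - x_i) * g

rng = np.random.default_rng(7)
x_i = np.array([0.2, -0.4, 0.9])     # a firework (flattened W, b, c values)
x_best = np.array([0.0, 0.1, 0.5])   # current best individual X_B
spark = gaussian_mutation(x_i, x_best, rng)
```

With g = 1 the spark lands exactly on $X_B$; values above or below 1 overshoot or undershoot, which is what adds diversity.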
Step 4-5: Mapping rules
After Gaussian mutation, some fireworks may cross the boundary of the feasible region, and sparks outside the feasible region are useless. Sparks beyond the boundary are handled mainly with the modular-arithmetic mapping rule, calculated as:
$$\hat{x}_i^k = x_{\min}^k + \left|\hat{x}_i^k\right| \bmod \left(x_{\max}^k - x_{\min}^k\right)$$
where $\hat{x}_i^k$ denotes the position in dimension $k$ of the $i$-th firework individual that crossed the boundary, $x_{\min}^k$ and $x_{\max}^k$ denote the lower and upper boundaries of the fireworks in dimension $k$ respectively, and $rand(0,1)$ is a random number uniformly distributed on the interval $[0,1]$ (used by the alternative random mapping rule).
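The modular mapping rule can be sketched as below, applied only to out-of-bounds dimensions; the bounds and sample positions are illustrative.

```python
import numpy as np

def modular_map(x, x_min, x_max):
    """Map out-of-bounds spark positions back into [x_min, x_max] with the
    modular rule x_min + |x| mod (x_max - x_min); in-bounds values pass through."""
    x = np.asarray(x, dtype=float)
    span = x_max - x_min
    wrapped = x_min + np.abs(x) % span
    return np.where((x < x_min) | (x > x_max), wrapped, x)

# 1.7 and 5.2 fall outside [-1, 1] and are wrapped; -0.3 is left unchanged.
mapped = modular_map([1.7, -0.3, 5.2], x_min=-1.0, x_max=1.0)
```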
Step 4-6: Selection strategy
The selection operation chooses some of the excellent fireworks and sparks in the feasible region as the fireworks for the next explosion, keeping the optimal individual after each explosion in a cyclic manner so as to find the optimal solution of the problem. Suppose the candidate set is K and the firework population size is N. The individual with the smallest error in the candidate set is deterministically kept for the next generation as a firework; the remaining N-1 fireworks are selected from the candidate set by roulette. Distance within the candidate set is measured by Euclidean distance, and the sum $R(x_i)$ of the distances between the current individual and all other individuals in the candidate set is:
$$R(x_i) = \sum_{j \in K} d(x_i, x_j) = \sum_{j \in K} \left\| x_i - x_j \right\|$$
where $R(x_i)$ is the sum of distances between the current individual and all individuals in the candidate set, $d(x_i, x_j) = \left\| x_i - x_j \right\|$ is the Euclidean distance between the $i$-th and $j$-th individuals, and $K$ is the candidate set.
After computing $R(x_i)$, the selection probability of each firework is determined by the roulette method:
$$p(x_i) = \frac{R(x_i)}{\sum_{j \in K} R(x_j)}$$
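The distance-based roulette probabilities can be sketched as follows; the candidate positions are illustrative. Individuals far from the rest of the candidate set receive higher selection probability, which preserves population diversity.

```python
import numpy as np

def selection_probs(candidates):
    """Distance-based selection: R(x_i) is the sum of Euclidean distances from
    x_i to every other candidate; p(x_i) = R(x_i) / sum_j R(x_j)."""
    X = np.asarray(candidates, dtype=float)
    diff = X[:, None, :] - X[None, :, :]          # pairwise differences
    R = np.linalg.norm(diff, axis=2).sum(axis=1)  # R(x_i) for each candidate
    return R / R.sum()

# Hypothetical candidate set: two crowded individuals and one isolated one.
candidates = [[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]]
p = selection_probs(candidates)
```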
and 4-7: determining a stop condition
If the stopping condition is met, jumping out of the program and outputting an optimal result; if not, returning to the step 4-3 to continue circulation; the stop condition is that the number of iterations t is reachedmaxObtaining the population optimal individual XBDecoding to obtain the optimal parameter space [ W ] of DBNB,bB,cB]。
6. The FWA_DBN-based customer credit assessment method and system according to claim 1, characterized in that: in S5, for comparison with DBN models optimized by the genetic algorithm, particle swarm algorithm and ant colony algorithm, the present invention adopts the mean absolute percentage error (MAPE) and root mean square error (RMSE) as evaluation indexes, wherein:
$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{\hat{y}_i}\right| \times 100\%$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}$$
where $n$ is the number of output training samples, $i$ is the index of an output training sample, $y_i$ is the predicted value of the $i$-th output training sample, and $\hat{y}_i$ is the actual value of the $i$-th output training sample.
Smaller MAPE and RMSE values indicate higher prediction accuracy for the model; however, both indexes are relative and are only meaningful when compared on the same data.
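MAPE and RMSE as defined above can be sketched as follows, with the percentage error taken relative to the actual value as is conventional; the score values are hypothetical.

```python
import numpy as np

def mape(y_actual, y_pred):
    """Mean absolute percentage error, relative to the actual values."""
    y_actual, y_pred = np.asarray(y_actual, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_pred - y_actual) / y_actual)) * 100.0

def rmse(y_actual, y_pred):
    """Root mean square error."""
    y_actual, y_pred = np.asarray(y_actual, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_pred - y_actual) ** 2))

y_actual = [100.0, 200.0, 400.0]   # hypothetical actual credit scores
y_pred = [110.0, 190.0, 400.0]     # hypothetical model predictions
m, r = mape(y_actual, y_pred), rmse(y_actual, y_pred)
```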
7. The FWA_DBN-based customer credit assessment method and system according to claim 1, wherein in S6, the FWA_DBN credit scoring model is deployed to the loan application platform to output a real-time application credit score, realizing real-time approval of applicant clients, and performance data are periodically fed into model training to realize online updating of the model.
8. The FWA_DBN-based customer credit assessment method and system according to claim 1, characterized in that an FWA_DBN-based customer credit evaluation system is also provided, comprising the following modules:
a data set acquisition and labeling module: used at the loan-system back end for obtaining a training data set comprising application, repayment, operation and third-party data;
a feature data extraction and normalization module: used for extracting feature data strongly associated with repayment performance from the acquired training data set, normalizing the feature data, and splitting it into a training set and a test set;
a DBN model building module: used for building a DBN model according to the given training samples and the number of hidden-layer nodes, and automatically optimizing the RBM network parameters θ = {W, b, c} in the DBN model with the FWA algorithm;
a DBN training and test module: used for training the optimized DBN with the training set and verifying it with the test set to obtain the prediction accuracy of the model;
a DBN prediction module: the trained DBN network performs credit risk grade evaluation and prediction for online applicant clients.
CN202011351682.2A 2020-11-27 2020-11-27 Customer credit assessment method and system based on FWA _ DBN Pending CN112529684A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011351682.2A CN112529684A (en) 2020-11-27 2020-11-27 Customer credit assessment method and system based on FWA _ DBN

Publications (1)

Publication Number Publication Date
CN112529684A true CN112529684A (en) 2021-03-19

Family

ID=74994473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011351682.2A Pending CN112529684A (en) 2020-11-27 2020-11-27 Customer credit assessment method and system based on FWA _ DBN

Country Status (1)

Country Link
CN (1) CN112529684A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538133A (en) * 2021-07-26 2021-10-22 天元大数据信用管理有限公司 Credit evaluation method, device and medium based on limited Boltzmann machine
CN114881678A (en) * 2022-03-10 2022-08-09 南京邮电大学 High-precision logistics enterprise customer image drawing method based on big data technology
CN117437063A (en) * 2023-12-11 2024-01-23 交通银行股份有限公司湖南省分行 Financial risk prediction method and system
CN118070928A (en) * 2024-02-18 2024-05-24 淮阴工学院 Industrial process criticality index soft measurement modeling method

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106503661A (en) * 2016-10-25 2017-03-15 陕西师范大学 Face gender identification method based on fireworks depth belief network
CN107169565A (en) * 2017-04-27 2017-09-15 西安工程大学 Yarn quality prediction method based on fireworks algorithm improvement BP neural network
CN107292453A (en) * 2017-07-24 2017-10-24 国网江苏省电力公司电力科学研究院 A kind of short-term wind power prediction method based on integrated empirical mode decomposition Yu depth belief network
CN107993012A (en) * 2017-12-04 2018-05-04 国网湖南省电力有限公司娄底供电分公司 A kind of adaptive electric system on-line transient stability appraisal procedure of time
CN108089099A (en) * 2017-12-18 2018-05-29 广东电网有限责任公司佛山供电局 The diagnostic method of distribution network failure based on depth confidence network
CN110009160A (en) * 2019-04-11 2019-07-12 东北大学 A kind of power price prediction technique based on improved deepness belief network
CN110824915A (en) * 2019-09-30 2020-02-21 华南师范大学 GA-DBN network-based intelligent monitoring method and system for wastewater treatment
CN111275240A (en) * 2019-12-27 2020-06-12 华北电力大学 Load prediction method based on multi-energy coupling scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20210319)