CN112529683A

CN112529683A - Method and system for evaluating credit risk of customer based on CS-PNN

Info

Publication number: CN112529683A
Application number: CN202011351678.6A
Authority: CN
Inventors: 江远强
Original assignee: Baiweijinke Shanghai Information Technology Co ltd
Current assignee: Baiweijinke Shanghai Information Technology Co ltd
Priority date: 2020-11-27
Filing date: 2020-11-27
Publication date: 2021-03-19

Abstract

The invention relates to the technical field of risk control in the Internet finance industry, and in particular to a customer credit risk assessment method and system based on CS-PNN (a probabilistic neural network optimized by cuckoo search). Compared with BP (back propagation) and RBF (radial basis function) neural networks, the PNN integrates Bayesian decision theory and density function estimation on the basis of radial basis functions, and has the advantages of a simple network structure, few parameters to adjust, short running time and no local minima. Compared with other optimization algorithms such as GA, PSO and ACO, the CS algorithm searches for a global optimal solution by simulating the brood-parasitic nesting behavior of cuckoos combined with the Lévy flight search principle; it needs few parameters, converges quickly, is general, robust and easy to implement, and efficiently balances local search and global search. The CS-PNN model obtained by using CS to optimize the smoothing factor of the PNN has a simple network structure, fast convergence, good fault tolerance, high robustness, high classification accuracy and a strong ability to incorporate additional samples, and can meet the requirement of real-time credit risk assessment in a loan system.

Description

Method and system for evaluating credit risk of customer based on CS-PNN

Technical Field

The invention belongs to the technical field of risk control in the Internet finance industry, and in particular provides a customer credit risk assessment method and system based on CS-PNN.

Background

With the rapid development of Internet finance, the assessment of customer credit risk has become an important research field. Credit risk assessment calculates the credit risk of an applicant from the information submitted by the loan or credit card applicant and from information provided by third parties, divides the risk into different risk levels, and uses the risk level as the basis for approving the loan or credit card.

Credit assessment is essentially a classification problem in pattern recognition: machine learning methods classify applicants into customers with different credit ratings according to characteristics such as age, gender, marital status and income. During training, the classification model is derived from regularities found in historical data, and the model is then used to predict the default risk of future borrowers. The prior art mainly adopts machine learning methods such as logistic regression, support vector machines and decision trees. In Internet financial credit evaluation, the artificial neural network has proved in recent years to be a learning model with good performance. An artificial neural network is an information processing model that simulates the brain; it fits nonlinear models through a parallel network of neurons, and has self-learning and associative memory capabilities.

The probabilistic neural network (PNN) is a novel feedforward neural network that combines radial basis function neurons with competitive neurons, and integrates Bayesian decision theory and density function estimation on the basis of radial basis functions. Unlike the BP network, the learning algorithm of the PNN does not adjust the connection weights between neurons during training; the learning of the network depends entirely on the data samples, does not require training on a large amount of data, and only one parameter, the smoothing factor, needs to be determined. The PNN has the advantages of a simple network structure, fast convergence, good fault tolerance, high robustness, high classification accuracy and a strong ability to incorporate additional samples, and can meet the requirement of real-time data processing.

However, the performance of the PNN is strongly influenced by the smoothing factor δ. In PNN training, the larger the value of δ, the smoother the fit of the function, the higher the prediction precision and the faster the operation. Conversely, if the value of δ is too small, many neurons are needed to adapt to slow changes in the function, and network performance is poor. Reasonable selection of δ therefore plays a very important role in the classification performance of the network. In the prior art, swarm intelligence optimization algorithms such as the genetic algorithm, the particle swarm algorithm and the ant colony algorithm are used to optimize the smoothing factor automatically. These algorithms, however, have limitations with respect to the nature of the target problem, parameter tuning and computation time. Most of them are biased toward either global search capability or local search capability, and few can balance the two. Their training suffers from a long learning process, premature convergence, a tendency to fall into local optima and poor robustness, so that learning speed, learning precision and search efficiency are low.

Finding the optimal smoothing factor δ has thus become a bottleneck of the PNN. How to optimize the smoothing factor δ of the PNN with a more suitable intelligent algorithm and apply the result to customer credit risk assessment is a technical problem that those skilled in the art urgently need to solve.

Disclosure of Invention

The invention aims to provide a method and a system for evaluating customer credit risk based on CS-PNN, so as to solve the problems described in the background section.

In order to achieve the purpose, the invention provides the following technical scheme:

A method and a system for evaluating customer credit risk based on CS-PNN, the method comprising the following steps:

S1: constructing sample data: customers with existing loan performance records are sampled as modeling samples, and their credit feature data are collected;

S2: preprocessing the collected data, normalizing the preprocessed data by the Min-Max method, and dividing them into a training set and a test set at a ratio of 7:3;

S3: on the training set, using logistic regression or random forest to select as input the 10 features that most strongly influence the repayment status, and taking whether repayment is overdue as output, to establish a PNN prediction model;

S4: optimizing the smoothing factor of the PNN with the CS algorithm, wherein the optimization takes the accuracy of the model as its target, obtains the optimal smoothing factor by iterative optimization, and outputs the optimization result as the initial parameter of the PNN to obtain the CS-PNN prediction model;

S5: predicting the test set data with the trained CS-PNN prediction model, and evaluating the predictions by the root mean square error, the average relative error and the Theil inequality coefficient, to verify the quality of the optimized model and obtain the optimized system model;

S6: deploying the offline-trained CS-PNN model to an application platform, extracting feature values from the data of users applying online in real time, normalizing the feature values, inputting them into the trained probabilistic neural network, and outputting the user's credit risk score.

Preferably, in S1, customers with existing loan performance records are sampled as modeling samples, and their credit feature data are collected, wherein the credit feature data comprise personal basic information, operation-behavior event-tracking data and third-party data.

Preferably, in S2, the input quantities of the neural network differ greatly in order of magnitude because their units differ. If the raw quantities are input directly, neuron training saturates. Therefore, before training, the data must be normalized to the same order of magnitude. The preprocessed data are normalized by the Min-Max method, calculated as follows:

$$\hat{D}_i = \frac{D_i - D_{\min}}{D_{\max} - D_{\min}}$$

where $\hat{D}_i$ is the normalized data, $D_{\max}$ is the maximum value of the training sample set, $D_{\min}$ is the minimum value of the training sample set, and $D_i$ is the data itself.

The normalized training samples are reorganized to obtain an input matrix X and the corresponding output matrix Y, and are divided, in order of application time, into a training set and a test set at a ratio of 7:3.
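The normalization and split described in S2 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the disclosed implementation: the helper names `fit_min_max`, `apply_min_max` and `split_by_time`, the record layout `(application_time, features, label)`, and the choice to compute the minimum and maximum on the training portion only are all hypothetical.

```python
def fit_min_max(rows):
    """Per-feature minimum and maximum, computed on the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, d_min, d_max):
    """Scale each feature with (D_i - D_min) / (D_max - D_min)."""
    scaled_rows = []
    for row in rows:
        scaled = []
        for value, lo, hi in zip(row, d_min, d_max):
            span = hi - lo
            # A constant feature has no spread; map it to 0.0 (assumption).
            scaled.append((value - lo) / span if span else 0.0)
        scaled_rows.append(scaled)
    return scaled_rows


def split_by_time(records, train_ratio=0.7):
    """Order records by application time, then cut them 7:3 by default."""
    ordered = sorted(records, key=lambda r: r[0])
    cut = round(len(ordered) * train_ratio)
    return ordered[:cut], ordered[cut:]
```

Fitting the bounds on the earlier 70% and reusing them for the later 30% keeps information from the test period out of the training data; values in the test set may then fall slightly outside [0, 1].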

Preferably, in S3, the probabilistic neural network (PNN) is a feedforward neural network that combines radial basis function neurons with competitive neurons. It uses a Gaussian function as its basis function and is derived from probability density function estimation and the Bayesian classification rule. Unlike the BP network, the learning algorithm of the PNN does not adjust the connection weights between neurons during training; the learning of the network depends entirely on the data samples, does not require training on a large amount of data, and only one parameter, the smoothing factor, needs to be determined. The PNN has the advantages of simple operation, high robustness, a highly parallel structure and the capability of parallel implementation, and can meet the requirement of processing data in real time.

A PNN is constructed, comprising an input layer, a mode layer, a summation layer and an output layer. The number of nodes in the input layer, the mode (hidden) layer, the summation layer and the output layer of the PNN to be established is determined, and the PNN model is established by taking the training sample data as the input of the PNN. Specifically:

Step 3-1: input layer

The first layer of the PNN, called the input layer, receives values from the training samples and passes the feature vectors on to the network. The number of neurons in the input layer equals the number of feature dimensions of a sample. The input vector is $X = (x_1, x_2, \dots, x_n)^T$, where n is the sample dimension.

Step 3-2: mode layer

The second layer of the PNN, called the mode layer, is connected to the input layer by connection weights. The number of neurons in the mode layer equals the total number of training samples over all classes. The mode layer calculates the matching relationship, that is, the similarity, between the input feature vector and each pattern in the training set, and passes the distance through a Gaussian function to obtain its output, which can be expressed as:

$$\Phi_i(X) = \exp\left(-\frac{(X - W_i)^T (X - W_i)}{2\delta^2}\right)$$

where $\Phi_i(X)$ is the output of the i-th mode-layer neuron, X is the input feature vector, $W_i$ is the weight between the input layer and the mode layer, and δ is the smoothing factor.

Step 3-3: summation layer

The third layer of the PNN, called the summation layer, connects the mode-layer units of each class. Each class has exactly one summation-layer unit, which sums only the outputs of the mode-layer units belonging to its own class and is not connected to the mode-layer units of other classes. Its output is proportional to the kernel-based estimate of the probability density of that class. The number of neurons in the summation layer equals the number of sample classes.

Step 3-4: output layer

The fourth layer of the PNN is called the output layer. It consists of several threshold discriminators whose neurons are competitive neurons, and the class with the largest posterior probability density among the estimated probability densities is taken as the output of the whole system. The number of neurons in the output layer equals the number of classes in the training sample data. The output layer receives the class probability density functions output by the summation layer and outputs an m-dimensional vector $Y = (y_1, y_2, \dots, y_m)^T$. The output of the output layer can be expressed as:

$$f(x) = \frac{1}{(2\pi)^{p/2}\,\delta^{p}\,m}\sum_{i=1}^{m}\exp\left(-\frac{(x - x_i)^T (x - x_i)}{2\delta^2}\right)$$

where f(x) is the probability density function; p is the dimension of the training sample feature vector; $x_i$ is a training sample feature vector; m is the number of training samples; and δ is the smoothing factor. The value of δ determines the width of the bell-shaped curve centered on each sample point; it is the only parameter in the PNN that needs to be adjusted, and it has a great influence on the classification accuracy of the samples.
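The four layers described in steps 3-1 to 3-4 can be sketched in a few lines of Python. This is an illustrative sketch, not the disclosed implementation: the function name `pnn_predict` is hypothetical, the kernel is assumed to be the Gaussian $\exp(-\lVert x - w\rVert^2 / (2\delta^2))$, each class density is averaged over that class's own samples, and equal class priors are assumed when the largest density is selected.

```python
import math


def pnn_predict(x, train_X, train_y, delta):
    """Classify one normalized sample x; returns (label, class densities)."""
    p = len(x)                                   # input layer: one node per feature
    norm = (2 * math.pi) ** (p / 2) * delta ** p
    sums, counts = {}, {}
    for w, label in zip(train_X, train_y):
        # Mode layer: one neuron per training sample, Gaussian of the distance.
        dist2 = sum((a - b) ** 2 for a, b in zip(x, w))
        phi = math.exp(-dist2 / (2 * delta ** 2))
        # Summation layer: add the outputs of the neurons of the same class only.
        sums[label] = sums.get(label, 0.0) + phi
        counts[label] = counts.get(label, 0) + 1
    density = {c: sums[c] / (norm * counts[c]) for c in sums}
    # Output layer: competitive selection of the largest estimated density.
    return max(density, key=density.get), density
```

There is no iterative weight training here: the training samples themselves act as the mode-layer weights, so δ is the only quantity left to tune.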

Preferably, in S4, since the smoothing factor δ is the only parameter of the PNN that needs to be adjusted, the CS algorithm is used to find the optimal smoothing factor of the probabilistic neural network, specifically as follows:

S41: according to the given training samples, determining the network topology of the PNN and the number of nodes in each layer, and determining the initialization parameters of the CS algorithm, including the population size M, the maximum number of iterations $t_{max}$, the discovery probability $P_a$ and the step-size factor $\alpha_0$;

S42: encoding the smoothing factor δ to be optimized, and randomly generating M bird nests within a specified range. Each bird nest is a candidate value of the smoothing factor δ, so the population corresponds to a set of solutions $\{\delta_1, \delta_2, \delta_3, \dots, \delta_M\}$, $i = 1, 2, \dots, M$;

S43: determining a fitness function, and evaluating the fitness of each nest in the population by using the following formula:

where n is the total number of samples, and y'(i) and y(i) are the actual output value and the expected output value of the i-th sample, respectively.

S44: a Lévy flight is carried out for each cuckoo, with the aim of replacing poorer nests with new and possibly better ones. The path and position update formula for each cuckoo's nest is as follows:

$$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus L(\lambda)$$

where $x_i^{(t)}$ and $x_i^{(t+1)}$ are the position vectors of the i-th bird nest in generations t and t+1, respectively; $\oplus$ denotes point-to-point (entrywise) multiplication; and α is the step-size control quantity, which determines the random search range and is generally taken as 0.1. The step size is updated continuously during the search to narrow the search range, and the step-size update formula is as follows:

$$\alpha = \alpha_0 \left(x_i^{(t)} - x_{best}\right)$$

where $\alpha_0$ is a constant and $x_{best}$ is the current best-quality nest.

$L(\lambda)$ is the Lévy flight random search path; it obeys the Lévy distribution and satisfies:

$$L(\lambda) \sim \frac{\mu}{|v|^{1/\beta}}$$

where both μ and v follow normal distributions:

$$\mu \sim N(0, \sigma_\mu^2), \qquad v \sim N(0, 1), \qquad \sigma_\mu = \left\{\frac{\Gamma(1+\beta)\,\sin(\pi\beta/2)}{\Gamma\left(\frac{1+\beta}{2}\right)\,\beta\,2^{(\beta-1)/2}}\right\}^{1/\beta}$$

where Γ is the standard gamma function and β = 1.5.

S45: after the position is updated, a random number $r \in [0, 1]$ is generated and compared with the discovery probability $P_a$. If $r > P_a$, the nest position $x_i^{(t+1)}$ is changed randomly, the fitness of the new population is calculated and compared with that of the previous generation, the better fitness value is kept, and the optimal bird nest $x_{best}$ is recorded; otherwise, the position is kept unchanged.

S46: checking the number of iterations: if it is less than the maximum number of iterations $t_{max}$, steps S44 and S45 are repeated and the next iteration continues until the condition is met; otherwise, the algorithm ends, and the optimal smoothing factor $\delta_{best}$ is obtained from the optimal bird nest $x_{best}$;

S47: substituting the optimized smoothing factor $\delta_{best}$ into the PNN framework, and inputting the training samples to train the CS-PNN prediction model.
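Steps S41 to S47 can be sketched for a one-dimensional nest (a single smoothing factor) as follows. This is an illustrative sketch under several assumptions that the text does not fix: the function names `levy_step` and `cuckoo_search` are hypothetical; the Lévy step is drawn with Mantegna's algorithm; the fitness is treated as an error to be minimized; the "random change" of S45 is implemented as a random fraction of the difference between two randomly chosen nests; and the default values of `M`, `t_max`, `p_a` and `alpha0` are placeholders.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step with Mantegna's algorithm (assumption)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_mu = (num / den) ** (1 / beta)
    mu = rng.gauss(0.0, sigma_mu)
    v = rng.gauss(0.0, 1.0)
    # The small constant guards against division by zero when v is exactly 0.
    return mu / (abs(v) ** (1 / beta) + 1e-12)


def cuckoo_search(fitness, lower, upper, M=15, t_max=100, p_a=0.25,
                  alpha0=0.01, seed=0):
    """Minimize fitness(delta) on [lower, upper]; returns (delta_best, score)."""
    rng = random.Random(seed)

    def clip(d):
        return min(max(d, lower), upper)

    # S42: M random nests, each one a candidate smoothing factor.
    nests = [rng.uniform(lower, upper) for _ in range(M)]
    # S43: fitness of every nest (lower is better in this sketch).
    scores = [fitness(d) for d in nests]
    b = min(range(M), key=lambda i: scores[i])
    best, best_score = nests[b], scores[b]

    for _ in range(t_max):                        # S46: iteration limit
        for i in range(M):
            # S44: Levy flight with step size alpha0 * (x_i - x_best).
            alpha = alpha0 * (nests[i] - best)
            cand = clip(nests[i] + alpha * levy_step(rng))
            # S45: if r > P_a, change the position randomly (assumed form).
            if rng.random() > p_a:
                j, k = rng.randrange(M), rng.randrange(M)
                cand = clip(cand + rng.random() * (nests[j] - nests[k]))
            s = fitness(cand)
            if s < scores[i]:                     # keep the better nest
                nests[i], scores[i] = cand, s
                if s < best_score:
                    best, best_score = cand, s
    return best, best_score                       # S47: delta_best for the PNN
```

To tune a PNN, `fitness` would be a function that builds the network with the given δ and returns its error on held-out samples, for example one minus the classification accuracy, which is in line with S4 taking model accuracy as the optimization target.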

Preferably, in S5, in order to compare the prediction performance of the model with that of PNN models optimized by the genetic algorithm, the particle swarm algorithm and the ant colony algorithm, three indicators are selected to evaluate the prediction effect of the model: the root mean square error (RMSE), the average relative error (ARE) and the Theil inequality coefficient (Theil IC). The formulas are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i' - y_i\right)^2}$$

$$\mathrm{ARE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i' - y_i}{y_i}\right|$$

$$\mathrm{Theil\ IC} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i' - y_i\right)^2}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} y_i'^2} + \sqrt{\frac{1}{n}\sum_{i=1}^{n} y_i^2}}$$

where $y_i$ is the true value in the test sample set, $y_i'$ is the predicted value, and n is the number of samples.

The RMSE and the ARE measure the dispersion and the overall error of the model, respectively; the smaller their values, the smaller the prediction error, the more stable the model and the better the effect. The Theil IC takes values in the interval (0, 1); the closer it is to 0, the smaller the error and the better the prediction performance of the model.
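The three indicators can be computed as in the following sketch. It assumes the usual textbook definitions of RMSE, average relative error and the Theil inequality coefficient; the function names are hypothetical, and the relative error assumes that no true value is zero, so labels coded as 0/1 would have to be recoded or scored differently.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def average_relative_error(y_true, y_pred):
    """Mean of |y' - y| / |y|; assumes every true value is non-zero."""
    n = len(y_true)
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / n


def theil_ic(y_true, y_pred):
    """Theil inequality coefficient: 0 is a perfect fit, values near 1 are poor."""
    n = len(y_true)
    root_true = math.sqrt(sum(t ** 2 for t in y_true) / n)
    root_pred = math.sqrt(sum(p ** 2 for p in y_pred) / n)
    return rmse(y_true, y_pred) / (root_true + root_pred)
```

For instance, with true values `[1.0, 2.0]` and predictions `[2.0, 2.0]`, the average relative error is 0.5, because the first prediction is off by 100% and the second is exact.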

Preferably, in S6, the offline-trained CS-PNN model is deployed to an application platform. Feature values are extracted from the data of users applying online in real time, normalized, and input into the trained CS-PNN model, which outputs the user's credit assessment. Loan performance data are fed back into model training at regular intervals to update the model online.
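The online scoring path of S6 can be sketched as follows. This is a hedged illustration, not the disclosed deployment: the function names `class_densities` and `risk_score`, the label `1` for overdue customers, the clipping of out-of-range features, and the conversion of class densities into a score between 0 and 1 (the share of the overdue-class density, which amounts to assuming equal class priors) are assumptions introduced for the example.

```python
import math


def class_densities(x, train_X, train_y, delta):
    """Unnormalized per-class Gaussian kernel density estimates at x."""
    sums, counts = {}, {}
    for w, label in zip(train_X, train_y):
        dist2 = sum((a - b) ** 2 for a, b in zip(x, w))
        sums[label] = sums.get(label, 0.0) + math.exp(-dist2 / (2 * delta ** 2))
        counts[label] = counts.get(label, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


def risk_score(raw, d_min, d_max, train_X, train_y, delta, overdue_label=1):
    """Normalize one online application and return a score in [0, 1]."""
    x = []
    for value, lo, hi in zip(raw, d_min, d_max):
        span = hi - lo
        z = (value - lo) / span if span else 0.0
        # Online values may fall outside the training range; clip them.
        x.append(min(max(z, 0.0), 1.0))
    dens = class_densities(x, train_X, train_y, delta)
    total = sum(dens.values())
    # If every kernel underflows to zero, fall back to an uninformative 0.5.
    return dens.get(overdue_label, 0.0) / total if total else 0.5
```

The bounds `d_min` and `d_max` and the smoothing factor `delta` are the values fixed during offline training; periodic retraining, as described above, would refresh all three together with the stored training samples.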

Preferably, a system for evaluating customer credit risk based on CS-PNN is also provided, comprising the following modules:

a dataset acquisition and labeling module, used to obtain from the back end of the loan system a training data set comprising application, repayment, operation and third-party data;

a data preprocessing and normalization module, used to preprocess the data, including data cleaning, missing-value handling, outlier handling, data transformation and data formatting, to normalize the preprocessed data, and to divide them into a training set and a test set;

a PNN model construction module, used to determine the number of nodes in the input layer, the hidden layer, the summation layer and the output layer of the probabilistic neural network to be established, and to establish the probabilistic neural network model by taking the training sample data as its input;

a CS-PNN model construction module, used to obtain the optimal smoothing factor by iterative optimization with the CS algorithm, and to output the optimization result as the initial parameter of the PNN to obtain the CS-PNN prediction model;

a PNN training and testing module, used to train the optimized PNN on the training set and to verify it on the test set to obtain the accuracy of model prediction;

a PNN prediction module, used to perform credit risk assessment and prediction for online applicants with the trained PNN model.

Compared with the prior art, the invention has the beneficial effects that:

1. Compared with BP and RBF neural networks, the PNN integrates Bayesian decision theory and density function estimation on the basis of radial basis functions, and has the advantages of a simple network structure, few parameters to adjust, short running time and no local minima.

2. Compared with other optimization algorithms such as GA, PSO and ACO, the CS algorithm searches for a global optimal solution by simulating the brood-parasitic nesting behavior of cuckoos combined with the Lévy flight search principle; it needs few parameters, converges quickly, is general, robust and easy to implement, and efficiently balances local search and global search.

3. In the invention, the CS-PNN model obtained by using CS to optimize the smoothing factor of the PNN has a simple network structure, fast convergence, good fault tolerance, high robustness, high classification accuracy and a strong ability to incorporate additional samples, and can meet the requirement of real-time credit risk assessment in a loan system.

Drawings

Fig. 1 is a schematic view of the overall structure of the present invention.

Detailed Description

Example 1:

Referring to Fig. 1, the present invention provides the following technical solution:

In S1, customers with existing loan performance records are sampled as modeling samples, and their credit feature data are collected. The credit feature data include personal basic information, operation-behavior event-tracking data and third-party data. This arrangement facilitates the collection of user data.

In S2, the input quantities of the neural network differ greatly in order of magnitude because their units differ. If the raw quantities are input directly, neuron training saturates easily. Therefore, before training, the data must be normalized to the same order of magnitude. The preprocessed data are normalized by the Min-Max method, calculated as follows:

$$\hat{D}_i = \frac{D_i - D_{\min}}{D_{\max} - D_{\min}}$$

where the normalized value, $D_{\max}$, $D_{\min}$ and $D_i$ are as defined above.

The normalized training samples are reorganized to obtain an input matrix X and the corresponding output matrix Y, and are divided, in order of application time, into a training set and a test set at a ratio of 7:3. This arrangement facilitates data processing.

In S3, the probabilistic neural network (PNN) is a feedforward neural network that combines radial basis function neurons with competitive neurons. It uses a Gaussian function as its basis function and is derived from probability density function estimation and the Bayesian classification rule. Unlike the BP network, the learning algorithm of the PNN does not adjust the connection weights between neurons during training; the learning of the network depends entirely on the data samples, does not require training on a large amount of data, and only the smoothing factor needs to be determined. The PNN has the advantages of simple operation, high robustness, a highly parallel structure and the capability of parallel implementation, and can meet the requirement of processing data in real time.

Step 3-1: input layer

Step 3-2: mode layer

Step 3-3: summation layer

Step 3-4: output layer

The fourth layer of the PNN is called the output layer. It consists of several threshold discriminators whose neurons are competitive neurons, and the class with the largest posterior probability density among the estimated probability densities is taken as the output of the whole system. The number of neurons in the output layer equals the number of classes in the training sample data. The output layer receives the class probability density functions output by the summation layer and outputs an m-dimensional vector $Y = (y_1, y_2, \dots, y_m)^T$. The output of the output layer can be expressed as:

$$f(x) = \frac{1}{(2\pi)^{p/2}\,\delta^{p}\,m}\sum_{i=1}^{m}\exp\left(-\frac{(x - x_i)^T (x - x_i)}{2\delta^2}\right)$$

where f(x) is the probability density function; p is the dimension of the training sample feature vector; $x_i$ is a training sample feature vector; m is the number of training samples; and δ is the smoothing factor. The value of δ determines the width of the bell-shaped curve centered on each sample point; it is the only parameter in the PNN that needs to be adjusted, and it has a great influence on the classification accuracy of the samples. Fixing the value of the smoothing factor δ in this way helps to control the accuracy of sample classification.

In S4, since the smoothing factor δ is the only parameter of the PNN that needs to be adjusted, the CS algorithm is used to find the optimal smoothing factor of the probabilistic neural network. In the position update of the bird nests, $\oplus$ denotes point-to-point (entrywise) multiplication, and α is the step-size control quantity, which determines the random search range and is generally taken as 0.1; the step size is updated continuously during the search to narrow the search range. $L(\lambda)$ is the Lévy flight random search path, which obeys the Lévy distribution, and both μ and v follow normal distributions.

When a nest position is changed randomly, the fitness of the new population is calculated and compared with that of the previous generation, the better fitness value is kept, and the optimal bird nest $x_{best}$ is recorded; otherwise, the position is kept unchanged.

S46: checking the number of iterations: if it is less than the maximum number of iterations $t_{max}$, steps S44 and S45 are repeated and the next iteration continues until the condition is met; otherwise, the algorithm ends, and the optimal smoothing factor $\delta_{best}$ is obtained from the optimal bird nest $x_{best}$;

S47: substituting the optimized smoothing factor $\delta_{best}$ into the PNN framework, and inputting the training samples to train the CS-PNN prediction model. This arrangement helps to determine the optimal smoothing factor $\delta_{best}$ accurately.

In S5, in order to compare the prediction performance of the model with that of PNN models optimized by the genetic algorithm, the particle swarm algorithm and the ant colony algorithm, three indicators are selected to evaluate the prediction effect of the model: the root mean square error (RMSE), the average relative error (ARE) and the Theil inequality coefficient (Theil IC). The formulas are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i' - y_i\right)^2}, \qquad \mathrm{ARE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i' - y_i}{y_i}\right|, \qquad \mathrm{Theil\ IC} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i' - y_i\right)^2}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} y_i'^2} + \sqrt{\frac{1}{n}\sum_{i=1}^{n} y_i^2}}$$

where $y_i$ is the true value in the test sample set, $y_i'$ is the predicted value, and n is the number of samples. The RMSE and the ARE measure the dispersion and the overall error of the model, respectively; the smaller their values, the smaller the prediction error, the more stable the model and the better the effect. The Theil IC takes values in the interval (0, 1); the closer it is to 0, the smaller the error and the better the prediction performance of the model. This arrangement makes it easy to judge the prediction performance of the model intuitively.

In S6, the offline-trained CS-PNN model is deployed to an application platform. Feature values are extracted from the data of users applying online in real time, normalized, and input into the trained probabilistic neural network, which outputs the user's credit assessment. Loan performance data are fed back into model training at regular intervals to update the model online. This arrangement supports continuous optimization of the model.

A system for evaluating customer credit risk based on CS-PNN is also provided, comprising the following modules:

a CS-PNN model construction module, used to iteratively optimize the probabilistic neural network with the CS algorithm to obtain the optimal smoothing factor, and to output the optimization result as the initial parameter of the PNN to obtain the CS-PNN prediction model;

a PNN prediction module, used to perform credit risk assessment and prediction for online applicants with the trained PNN model. This arrangement allows the credit level of a customer to be predicted in advance, so that the customer can be granted a corresponding credit entitlement.

The working process is as follows. The invention comprises the following steps:

S6: deploying the offline-trained CS-PNN model to an application platform, extracting feature values from the data of users applying online in real time, normalizing the feature values, inputting them into the CS-PNN model, and outputting the user's credit risk assessment.

Through S1 to S6, customer data are collected as input, and the CS-PNN model outputs the customer's credit risk score to predict whether the customer will repay on time.

Example 2: the parts that are the same as in Example 1 are not repeated; the differences are as follows. In S1, customers with existing loan performance records are sampled as modeling samples, and their credit feature data are collected. The credit feature data include personal basic information, operation-behavior event-tracking data and third-party data. This arrangement facilitates the collection of user data.


Step 3-1: input layer

Step 3-2: mode layer

Step 3-3: summation layer

Step 3-4: output layer

The fourth layer of the PNN is called the output layer. It consists of several threshold discriminators whose neurons are competitive neurons, and the class with the largest posterior probability density among the estimated probability densities is taken as the output of the whole system. The number of neurons in the output layer equals the number of classes in the training sample data. The output layer receives the class probability density functions output by the summation layer and outputs an m-dimensional vector $Y = (y_1, y_2, \dots, y_m)^T$. The output of the output layer is expressed by the same probability density formula as in Example 1.


where both μ and v follow a normal distribution.

where $y_i$ is the true value in the test sample set, $y_i'$ is the predicted value, and n is the number of samples. The RMSE and the ARE measure the dispersion and the overall error of the model, respectively; the smaller their values, the smaller the prediction error, the more stable the model and the better the effect. The Theil IC takes values in the interval (0, 1); the closer it is to 0, the smaller the error and the better the prediction performance of the model. This arrangement makes it easy to judge the prediction performance of the model intuitively.

Compared with BP and RBF neural networks, the PNN integrates Bayesian decision theory and density function estimation on the basis of radial basis functions, and has the advantages of a simple network structure, few parameters to adjust, short running time and no local minima. Compared with other optimization algorithms such as GA, PSO and ACO, the CS algorithm searches for a global optimal solution by simulating the brood-parasitic nesting behavior of cuckoos combined with the Lévy flight search principle; it needs few parameters, converges quickly, is general, robust and easy to implement, and efficiently balances local search and global search. The CS-PNN model obtained by using CS to optimize the smoothing factor of the PNN has a simple network structure, fast convergence, good fault tolerance, high robustness, high classification accuracy and a strong ability to incorporate additional samples, and can meet the requirement of real-time credit risk assessment in a loan system.

The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method of the present invention and its core concepts. The foregoing is only a preferred embodiment of the present invention. It should be noted that, because written expression is limited while the possible specific structures are objectively unlimited, those skilled in the art may make various modifications, refinements or changes without departing from the principle of the present invention, and may combine the technical features described above in a suitable manner. Such modifications, variations and combinations, as well as applications of the spirit of the invention to other uses and embodiments, fall within its scope as defined by the claims.

Claims

1. A method and system for evaluating credit risk of a customer based on CS-PNN, characterized in that the method comprises the following steps:

and S6, deploying the offline-trained CS-PNN model to an application platform, extracting feature values from the online real-time application data of a user, normalizing the feature values, inputting the normalized feature values into the CS-PNN model, and outputting the credit risk score of the user.

2. The method and system for assessing credit risk of a customer based on CS-PNN as claimed in claim 1, wherein: in S1, customers with existing loan performance are sampled as modeling samples, and the credit feature data of the customers are collected, the credit feature data comprising personal basic information, operation-behavior event-tracking (buried point) data, and third-party data.

3. The method and system for assessing credit risk of a customer based on CS-PNN as claimed in claim 1, wherein: in S2, because the input quantities of the neural network have different units and differ greatly in magnitude, inputting them directly easily saturates the neurons during training; therefore, before training, the preprocessed data must be normalized to the same order of magnitude by the Min-Max method, and the calculation formula is as follows:

wherein,
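By way of non-limiting illustration, Min-Max normalization can be sketched in Python as follows. The sketch assumes the usual form of the method, in which each feature value x is mapped to (x - min) / (max - min) using the minimum and maximum of that feature over the modeling samples. The function names, the handling of constant features, and the sample values are assumptions made for this example.

```python
def min_max_fit(rows):
    """Return the per-feature minimum and maximum over the modeling samples."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    return mins, maxs


def min_max_transform(row, mins, maxs):
    """Scale one sample into [0, 1] per feature using fitted bounds.

    Assumption: a feature with zero range (max == min) is mapped to 0.0.
    """
    scaled = []
    for x, lo, hi in zip(row, mins, maxs):
        span = hi - lo
        scaled.append(0.0 if span == 0 else (x - lo) / span)
    return scaled


# Illustrative values only: two features with very different magnitudes.
samples = [[20, 3000.0], [40, 9000.0], [30, 6000.0]]
mins, maxs = min_max_fit(samples)
print(min_max_transform([30, 6000.0], mins, maxs))
```

The bounds are fitted once on the modeling samples and then reused for online data, so that offline and online inputs are on the same scale. Online values outside the fitted range would fall outside [0, 1]; whether to clip them is a design choice that the text does not specify.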

4. The method and system for assessing credit risk of a customer based on CS-PNN as claimed in claim 1, wherein: in S3, the probabilistic neural network (PNN) is a feedforward neural network that combines radial basis function neurons with competitive neurons, uses a Gaussian function as the basis function, and is derived from probability density function estimation and the Bayesian classification rule. Unlike a BP network, the learning algorithm of the PNN does not adjust the connection weights between neurons during training; the learning of the network depends entirely on the data samples, training on a large amount of data is not needed, and only the smoothing factor needs to be determined. The PNN has the advantages of simple operation, high robustness, and a highly parallel structure that can be implemented in parallel, and can meet the requirement of processing data in real time.

Step 3-1: input layer

Step 3-2: pattern layer

Step 3-3: summation layer

Step 3-4: output layer

The fourth layer of the PNN is called the output layer. It consists of a plurality of threshold discriminators whose neurons are competitive neurons, and the output of the whole system is the class with the maximum posterior probability density among the estimated probability densities. The number of output-layer neurons is equal to the number of classes in the training sample data; the layer receives the class probability density functions output by the summation layer and outputs an m-dimensional vector Y = (y₁, y₂, …, y_m)^T. The output of the output layer can be expressed as:
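By way of non-limiting illustration, the four layers named in this claim (input, pattern or "mode", summation, and output) can be sketched in Python as follows. The sketch is a textbook-style PNN, not necessarily the exact network of the invention. It assumes a Gaussian kernel with a single shared smoothing factor (called `sigma` here) and equal class priors, and it omits the kernel normalizing constant, which is the same for every class and does not change the winning class. All names and sample values are hypothetical.

```python
import math


def pnn_predict(x, train_x, train_y, sigma):
    """Classify one normalized sample x with a minimal PNN.

    Returns (predicted_label, per_class_density_estimates).
    """
    kernel_sums = {}
    class_counts = {}
    # Pattern layer: one Gaussian unit per training sample.
    for xi, yi in zip(train_x, train_y):
        squared_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        activation = math.exp(-squared_dist / (2.0 * sigma ** 2))
        kernel_sums[yi] = kernel_sums.get(yi, 0.0) + activation
        class_counts[yi] = class_counts.get(yi, 0) + 1
    # Summation layer: average the activations of each class.
    densities = {c: kernel_sums[c] / class_counts[c] for c in kernel_sums}
    # Output layer: competitive selection of the class with the largest estimate.
    label = max(densities, key=densities.get)
    return label, densities


# Illustrative data only: label 0 and label 1 stand for two credit classes.
train_x = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 0.9]]
train_y = [0, 0, 1, 1]
label, densities = pnn_predict([0.05, 0.05], train_x, train_y, sigma=0.3)
print(label, densities)
```

Because training consists only of storing the samples, the smoothing factor is the only quantity left to tune, which is why it is the target of the optimization step.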

5. The method and system for assessing credit risk of a customer based on CS-PNN as claimed in claim 1, wherein: in S4, since the only parameter to be adjusted in the PNN is the smoothing factor δ, the CS algorithm is used to find the optimal smoothing factor of the probabilistic neural network, which specifically includes:

wherein,

and

where both μ and v follow a normal distribution.
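By way of non-limiting illustration, a cuckoo search over a single bounded parameter such as the smoothing factor can be sketched in Python as follows. The sub-steps and formulas of this claim are not reproduced in this passage, so the sketch follows the widely used Yang-Deb formulation and draws Levy steps with Mantegna's method, in which the step is μ / |v|^(1/β) with μ and v normally distributed. The parameter names and default values (`n_nests`, `pa`, `alpha`, `beta`) and the stand-in objective are assumptions made for this example; in the setting of the invention the objective would be an error measure of the PNN for a given smoothing factor.

```python
import math
import random


def levy_step(beta, rng):
    """Draw one Levy-distributed step with Mantegna's method."""
    sigma_mu = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    mu = rng.gauss(0.0, sigma_mu)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return mu / abs(v) ** (1 / beta)


def cuckoo_search(objective, lower, upper, n_nests=15, pa=0.25,
                  alpha=0.1, beta=1.5, iterations=200, seed=0):
    """Minimize a one-dimensional objective on [lower, upper]."""
    rng = random.Random(seed)

    def clip(value):
        return min(max(value, lower), upper)

    nests = [rng.uniform(lower, upper) for _ in range(n_nests)]
    costs = [objective(x) for x in nests]
    for _ in range(iterations):
        # Global search: lay a new egg by a Levy flight from each nest and
        # let it replace a randomly chosen nest if it is better.
        for i in range(n_nests):
            candidate = clip(nests[i] + alpha * levy_step(beta, rng))
            cost = objective(candidate)
            j = rng.randrange(n_nests)
            if cost < costs[j]:
                nests[j], costs[j] = candidate, cost
        # Discovery: each nest except the current best is abandoned with
        # probability pa and rebuilt at a random position.
        best = min(range(n_nests), key=costs.__getitem__)
        for i in range(n_nests):
            if i != best and rng.random() < pa:
                nests[i] = rng.uniform(lower, upper)
                costs[i] = objective(nests[i])
    best = min(range(n_nests), key=costs.__getitem__)
    return nests[best], costs[best]


# Stand-in objective with its minimum at 0.3 (illustrative only).
best_delta, best_cost = cuckoo_search(lambda d: (d - 0.3) ** 2, 0.01, 2.0)
print(best_delta, best_cost)
```

The heavy-tailed Levy steps occasionally produce long jumps, which provides the global exploration, while the frequent short steps refine the solution locally; the best nest is never abandoned, so the best cost found does not get worse between iterations.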

6. The method and system for assessing credit risk of a customer based on CS-PNN as claimed in claim 1, wherein: in S5, in order to compare the prediction performance of the model with that of PNN models optimized by a genetic algorithm, a particle swarm algorithm and an ant colony algorithm, three indexes, namely the root mean square error (RMSE), the average relative error (ARE) and the Theil inequality coefficient (Theil IC), are selected to evaluate the prediction effect of the model, and the formulas are as follows:

wherein y_i is the true value of the i-th sample in the test sample set, y_i' is the corresponding predicted value, and n is the number of samples. RMSE and ARE measure the degree of dispersion and the overall error of the model, respectively; the smaller their values, the smaller the prediction error of the model, the more stable the model, and the better its effect. The Theil IC takes values in the interval (0,1); the closer it is to 0, the smaller the error and the better the prediction performance of the model.

7. The method and system for assessing credit risk of a customer based on CS-PNN as claimed in claim 1, wherein: in S6, the offline-trained CS-PNN model is deployed to the application platform; feature values are extracted from the online real-time application data of a user, normalized, and input into the trained CS-PNN model, which outputs the credit assessment of the user; and samples for which performance data have become available are periodically fed back into model training, so that the model is updated online.

8. The method and system for assessing credit risk of a customer based on CS-PNN as claimed in claim 1, wherein a system for assessing credit risk of a customer based on CS-PNN is also provided, the system comprising the following modules: