CN112330435A - Credit risk prediction method and system for optimizing Elman neural network based on genetic algorithm - Google Patents
- Publication number
- CN112330435A CN202011049256.3A
- Authority
- CN
- China
- Prior art keywords
- neural network
- data
- elman neural
- credit
- genetic algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
Abstract
The invention discloses a credit risk prediction method and system based on an Elman neural network optimized by a genetic algorithm. The method comprises the following steps: S1, collecting data: selecting from the back end of an internet financial platform, according to post-loan performance, a certain proportion and number of normally repaying and overdue customers as modeling samples, and collecting the personal basic information submitted when each sample customer registered an account and applied, together with the operation-behavior event-tracking data (buried point data) obtained from monitoring software; S2, preprocessing the data: performing missing-value completion, outlier handling and normalization on the collected data, and then dividing it into a training set and a test set at a ratio of 7:3; S3, determining the topology of the Elman neural network using the training-set samples; S4, setting the genetic algorithm parameters, combining the genetic algorithm with the neural network model, and using the training-set samples to optimize the initial weights and thresholds of the neural network. The Elman neural network has dynamic, nonlinear mapping characteristics and is particularly suitable for credit assessment prediction in internet finance.
Description
Technical Field
The invention belongs to the technical field of risk control in the internet finance industry, and particularly relates to a credit risk prediction method and system based on an Elman neural network optimized by a genetic algorithm.
Background
In recent years, artificial neural networks have proven to be well-performing models in internet financial credit assessment. An advantage of artificial neural network models in predicting, discovering and summarizing the structure of financial variables is that they do not rely on specific assumptions. The networks most commonly used in credit evaluation today are the BP neural network and the RBF neural network, or improvements based on these two. However, both have shortcomings when processing data for prediction: the BP network is based on gradient descent and suffers from local minima, low robustness and similar defects; the RBF network is a static feedforward network, is deficient in handling dynamic time-series modeling problems, and cannot adequately meet the needs of internet finance credit evaluation.
The Elman neural network is a feedback neural network. It adds a context layer to the structure of the BP neural network; this layer stores the hidden-layer output of the previous time step and supplies it as delayed data, which gives the network a dynamic memory capability. After training, the network exhibits dynamic, nonlinear mapping characteristics and is suited to prediction problems on time-series data. Since financial data is typical time-series data, the Elman neural network is particularly suitable for credit assessment prediction in internet finance.
Although the Elman neural network improves on traditional neural networks, its design still involves selecting and optimizing the training algorithm, the transfer function, the network structure, the hidden-layer connection weights, the thresholds and so on. In the prior art, gradient descent, particle swarm optimization, simulated annealing and similar algorithms are used to optimize the parameters of the Elman neural network, but these suffer from an unstable convergence process, slow convergence, a tendency to fall into local optima and similar defects, so a more suitable optimization algorithm is needed. A credit risk prediction method and system based on an Elman neural network optimized by a genetic algorithm is therefore proposed.
Disclosure of Invention
The invention aims to provide a credit risk prediction method and system based on an Elman neural network optimized by a genetic algorithm, so as to solve the problems described in the Background.
To achieve this purpose, the invention provides the following technical solution: a credit risk prediction method based on an Elman neural network optimized by a genetic algorithm, comprising the following steps:
S1, collecting data: selecting from the back end of an internet financial platform, according to post-loan performance, a certain proportion and number of normally repaying and overdue customers as modeling samples, and collecting the personal basic information submitted when each sample customer registered an account and applied, together with the operation-behavior event-tracking data (buried point data) obtained from monitoring software;
S2, preprocessing the data: performing missing-value completion, outlier handling and normalization on the collected data, and then dividing it into a training set and a test set at a ratio of 7:3;
S3, determining the topology of the Elman neural network using the training-set samples;
S4, setting the genetic algorithm parameters, combining the genetic algorithm with the neural network model, and using the training-set samples to optimize the initial weights and thresholds of the neural network, thereby obtaining the GA-Elman neural network credit score prediction model;
S5, inputting the test-set data into the GA-Elman neural network to test its prediction performance, and comparing it with Elman neural network models optimized by gradient descent and by particle swarm optimization;
S6, deploying the credit scoring model of the genetic-algorithm-optimized Elman neural network to the application platform to output credit scores for applications in real time, and periodically feeding data with observed repayment performance back into model training.
Preferably, in S1, a certain proportion and number of normally repaying and overdue customers are selected from the back end of the internet financial platform as modeling samples according to post-loan performance, and the personal basic information submitted when each sample customer registered an account and applied is collected together with operation-behavior event-tracking data obtained from monitoring software. The personal basic information includes: mobile phone number, education level, marital status, employer, address and contact information, as well as the personal basic information, credit transaction information, public information and special record data obtained from the credit report. The event-tracking data includes device behavior data and log data collected at the tracking points. The device behavior data includes: number of logins to the platform, number of clicks, click frequency, total and average input time, mobile phone number data, GPS position, MAC address, IP address data, application frequency by geographic location, application frequency by IP, device battery level, and average gyroscope acceleration. The log data includes: number of logins within 7 days, time from the first click to the credit application, maximum number of sessions in one day, behavior statistics for the week before the credit application, mobile internet behavior data, behavior data within the loan app, credit history, and full-domain multi-dimensional big data including operator data.
Preferably, in S2, the inputs to the neural network have different units and differ widely in magnitude, so before training the original variables must be normalized to a common scale; the processing includes both normalization and inverse normalization (denormalization).
Preferably, the Elman neural network in S3 is a typical dynamic neural network. It builds on the basic structure of the BP artificial neural network and maps dynamic features by storing internal state, which gives the system the ability to adapt to time-varying characteristics. The Elman neural network comprises an input layer, a hidden layer, an output layer and a context layer: the input-layer neurons transmit signals, the output-layer neurons perform linear weighting, the hidden-layer activation function may be linear or nonlinear, and the context layer feeds back into the input of the hidden layer, thereby delaying and storing the hidden-layer output.
Preferably, in S3, the numbers of input-layer and output-layer neurons are determined by the input and output parameters, and the number of hidden-layer neurons is determined by an empirical rule combined with trial and error.
Preferably, in S4, the genetic algorithm parameters are set and combined with the neural network model. Individuals are encoded with real-number coding: the value ranges of the weights w and thresholds b of the Elman neural network are set, a set of real numbers is selected by interpolation as a chromosome, and the genes of each chromosome encode the Elman neural network parameter combination (w, b) in binary form. An initial population is generated randomly, the minimum model output error is used as the fitness criterion, and the optimal individual is found through selection, crossover and mutation operations; this determines the initial weight and threshold combination (w, b) of the neural network and yields the best-performing Elman neural network model.
Preferably, in S5, the root mean square error is used to analyze the prediction results: if the error is large, training is repeated; if the error is within the allowable range, the GA-Elman neural network is considered successfully trained.
Preferably, in S6, the credit scoring model of the genetic-algorithm-optimized Elman neural network is deployed to the application platform to output credit scores for applications in real time, enabling real-time approval of applicants; also in S6, data with observed repayment performance is periodically fed back into model training, enabling online updating of the model.
The invention also provides a credit risk prediction system based on an Elman neural network optimized by a genetic algorithm, comprising:
a sample acquisition unit: used to collect training samples, including personal application information, operation-behavior event-tracking data, and post-loan repayment performance as the evaluation result;
a data processing unit: used to extract features from the collected data and to perform missing-value completion, outlier handling and normalization;
a model training unit: used to set the genetic algorithm parameters, combine the genetic algorithm with the Elman neural network model, and optimize the initial weights and thresholds of the neural network to obtain the GA-Elman neural network credit score prediction model;
a prediction unit: used to perform credit risk prediction for online applicants with the trained Elman neural network.
Compared with the prior art, the invention has the following beneficial effects:
(1) The Elman neural network has dynamic, nonlinear mapping characteristics and is particularly suitable for credit assessment prediction in internet finance.
(2) The invention uses the nonlinear optimization capability of the genetic algorithm to adjust the parameters of the Elman neural network model automatically and searches globally for the optimal weights and thresholds. This overcomes the unstable convergence, slow convergence and tendency to fall into local optima of conventional Elman training, and improves the stability and generalization ability of the Elman neural network.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to FIG. 1, the invention provides a technical solution: a credit risk prediction method based on an Elman neural network optimized by a genetic algorithm, comprising the following steps:
S1, collecting data: selecting from the back end of an internet financial platform, according to post-loan performance, a certain proportion and number of normally repaying and overdue customers as modeling samples, and collecting the personal basic information submitted when each sample customer registered an account and applied, together with the operation-behavior event-tracking data (buried point data) obtained from monitoring software;
S2, preprocessing the data: performing missing-value completion, outlier handling and normalization on the collected data, and then dividing it into a training set and a test set at a ratio of 7:3;
S3, determining the topology of the Elman neural network using the training-set samples, the topology comprising the numbers of neurons in the input, output and hidden layers, the number of hidden layers, and the initial weights and thresholds of the neural network;
S4, setting the genetic algorithm parameters, combining the genetic algorithm with the neural network model, and using the training-set samples to optimize the initial weights and thresholds of the neural network, thereby obtaining the GA-Elman neural network credit score prediction model;
S5, inputting the test-set data into the GA-Elman neural network to test its prediction performance, and comparing it with Elman neural network models optimized by gradient descent and by particle swarm optimization;
S6, deploying the credit scoring model of the genetic-algorithm-optimized Elman neural network to the application platform to output credit scores for applications in real time, and periodically feeding data with observed repayment performance back into model training.
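The 7:3 division in S2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_train_test`, the fixed seed and the decision to shuffle before splitting are all assumptions, since the text only states the ratio.

```python
import random


def split_train_test(samples, train_ratio=0.7, seed=0):
    """Shuffle a copy of the samples and cut it at train_ratio.

    Shuffling and the seed are assumptions made for this sketch;
    the source only specifies the 7:3 proportion.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Ten hypothetical sample ids split into 7 training and 3 test samples.
train_ids, test_ids = split_train_test(range(10))
```

If the samples must stay in time order, the shuffle would be dropped and the cut applied to the ordered list instead.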
In this embodiment, preferably, in S1, a certain proportion and number of normally repaying and overdue customers are selected from the back end of the internet financial platform as modeling samples according to post-loan performance, and the personal basic information submitted when each sample customer registered an account and applied is collected together with operation-behavior event-tracking data obtained from monitoring software. The personal basic information includes: mobile phone number, education level, marital status, employer, address and contact information, as well as the personal basic information, credit transaction information, public information and special record data obtained from the credit report. The event-tracking data includes device behavior data and log data collected at the tracking points. The device behavior data includes: number of logins to the platform, number of clicks, click frequency, total and average input time, mobile phone number data, GPS position, MAC address, IP address data, application frequency by geographic location, application frequency by IP, device battery level, and average gyroscope acceleration. The log data includes: number of logins within 7 days, time from the first click to the credit application, maximum number of sessions in one day, behavior statistics for the week before the credit application, mobile internet behavior data, behavior data within the loan app, credit history, and full-domain multi-dimensional big data including operator data.
In this embodiment, preferably, in S2, the inputs to the neural network have different units and differ widely in magnitude, so before training the original variables must be normalized to a common scale. The processing includes normalization and inverse normalization (denormalization), with the following expressions.
The normalization expression is:
$$y_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$
The denormalization expression is:
$$x_i = (x_{\max} - x_{\min}) \cdot y_i + x_{\min}$$
where $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the training-sample input, and $x_i$ and $y_i$ are the values of an input sample before and after normalization, respectively.
Normalization maps the original data to the interval [0, 1], which effectively removes the influence of differing units and magnitudes among the original variables; the predicted values produced by the model are finally restored to real values by denormalization.
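The mapping to [0, 1] and its inverse can be illustrated with a short sketch. The function names and the handling of a constant feature (where the maximum equals the minimum) are assumptions made for illustration; the source does not address that case.

```python
def fit_min_max(values):
    """Return (x_min, x_max) computed from the training sample."""
    return min(values), max(values)


def normalize(x, x_min, x_max):
    """Map x into [0, 1]: y = (x - x_min) / (x_max - x_min)."""
    if x_max == x_min:
        return 0.0  # constant feature: assumed convention, not from the source
    return (x - x_min) / (x_max - x_min)


def denormalize(y, x_min, x_max):
    """Invert the mapping: x = (x_max - x_min) * y + x_min."""
    return (x_max - x_min) * y + x_min


train = [3.0, 7.0, 11.0]  # hypothetical values of one input variable
x_min, x_max = fit_min_max(train)
scaled = [normalize(x, x_min, x_max) for x in train]
restored = [denormalize(y, x_min, x_max) for y in scaled]
```

Here the minimum and maximum are taken from the training sample only, so the same pair can be reused to scale test data and to restore predictions.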
In this embodiment, preferably, the Elman neural network in S3 is a typical dynamic neural network. It builds on the basic structure of the BP artificial neural network and maps dynamic features by storing internal state, which gives the system the ability to adapt to time-varying characteristics. The Elman neural network comprises an input layer, a hidden layer, an output layer and a context layer: the input-layer neurons transmit signals, the output-layer neurons perform linear weighting, the hidden-layer activation function may be linear or nonlinear, and the context layer feeds back into the input of the hidden layer, thereby delaying and storing the hidden-layer output.
In this embodiment, preferably, in S3, the numbers of input-layer and output-layer neurons are determined by the input and output parameters, and the number of hidden-layer neurons is determined by an empirical rule combined with trial and error. The empirical theoretical value may be obtained from an empirical rule for the hidden-layer size:
[formula not preserved in the source text]
where m is the number of input-layer neurons, n is the number of output-layer neurons, and r is the number of hidden-layer neurons.
The basic Elman neural network algorithm consists of two parts: forward propagation of signals and backward propagation of errors. The actual output is computed in the direction from input to output, and the weights and thresholds of each layer are corrected in the direction from output to input. The mathematical model of the Elman neural network is as follows:
$$X(t) = f\big(w_1 X_c(t) + w_2 U(t-1) + b_1\big)$$
$$X_c(t) = X(t-1)$$
$$Y(t) = g\big(w_3 X(t) + b_2\big)$$
where t is the current time step, $X(t)$ is the hidden-layer output, $U(t-1)$ is the network input at the previous time step, $X_c(t)$ is the context-layer output, and $Y(t)$ is the predicted network output; $w_1$, $w_2$ and $w_3$ are the connection weights from the context layer to the hidden layer, from the input layer to the hidden layer, and from the hidden layer to the output layer, respectively; $b_1$ and $b_2$ are the thresholds of the hidden layer and the output layer; $g(\cdot)$ is the transfer function of the output neurons, a linear combination of the hidden-layer outputs, generally the linear purelin function; $f(\cdot)$ is the transfer function of the hidden-layer neurons, generally the Sigmoid function, namely:
$$f(x) = \frac{1}{1 + e^{-x}}$$
The Elman neural network generally uses the BP algorithm to correct the weights, and the learning objective is the sum-of-squared-errors function:
$$E = \sum_{t} \big[Y(t) - \tilde{Y}(t)\big]^2$$
where $Y(t)$ is the predicted network output and $\tilde{Y}(t)$ is the corresponding expected value.
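The forward pass described by the model equations can be sketched in plain Python. This is an illustrative sketch under stated assumptions: weights are stored as lists of rows, the context layer starts at zero, and the helper names (`matvec`, `elman_step`, `run_sequence`, `sum_squared_error`) are hypothetical. Backpropagation of the error is not shown.

```python
import math


def sigmoid(z):
    """Hidden-layer transfer function f."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def elman_step(u_prev, x_prev, w1, w2, w3, b1, b2):
    """One time step: X = f(w1*Xc + w2*U + b1), Y = w3*X + b2 (linear g)."""
    x_c = x_prev  # context layer holds the previous hidden output
    rec = matvec(w1, x_c)
    inp = matvec(w2, u_prev)
    x = [sigmoid(r + i + b) for r, i, b in zip(rec, inp, b1)]
    y = [o + b for o, b in zip(matvec(w3, x), b2)]
    return x, y


def run_sequence(inputs, w1, w2, w3, b1, b2):
    """Feed a sequence of input vectors through the network."""
    x = [0.0] * len(b1)  # zero initial context: an assumption of this sketch
    outputs = []
    for u in inputs:
        x, y = elman_step(u, x, w1, w2, w3, b1, b2)
        outputs.append(y)
    return outputs


def sum_squared_error(outputs, targets):
    """Sum over time steps and output neurons of the squared error."""
    return sum((y - d) ** 2
               for ys, ds in zip(outputs, targets)
               for y, d in zip(ys, ds))


# Toy network: 2 inputs, 2 hidden neurons, 1 output, zero hidden weights.
w1 = [[0.0, 0.0], [0.0, 0.0]]
w2 = [[0.0, 0.0], [0.0, 0.0]]
w3 = [[1.0, 1.0]]
outs = run_sequence([[1.0, 2.0], [3.0, 4.0]], w1, w2, w3, [0.0, 0.0], [0.0])
```

With all hidden weights at zero each hidden neuron outputs sigmoid(0) = 0.5, so the single output is 1.0 at every step, which makes the toy case easy to check by hand.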
In this embodiment, preferably, in S4, the genetic algorithm parameters are set and combined with the neural network model. Individuals are encoded with real-number coding: the value ranges of the weights w and thresholds b of the Elman neural network are set, a set of real numbers is selected by interpolation as a chromosome, and the genes of each chromosome encode the Elman neural network parameter combination (w, b) in binary form. An initial population is generated randomly, the minimum model output error is used as the fitness criterion, and the optimal individual is found through selection, crossover and mutation operations; this determines the initial weight and threshold combination (w, b) of the neural network and yields the best-performing Elman neural network model. The main steps are as follows:
Step 4-1: Setting genetic algorithm parameters
Initialize the population parameters, including the population size $M$, the crossover probability $P_c$, the mutation probability $P_m$, the maximum number of evolution iterations $G_{\max}$, and the current iteration count $g$;
Step 4-2: Establishing a fitness function
The Elman neural network is trained, and the sum of the absolute errors between the predicted and expected outputs is taken as the individual fitness value, calculated as:
$$F = k \sum_{i=1}^{n} \lvert Y_i - y_i \rvert$$
where $F$ is the individual fitness value, $n$ is the number of output-layer neurons, $Y_i$ is the expected output of the i-th neuron of the Elman neural network, $y_i$ is the predicted output of the i-th neuron, and $k$ is a weighting coefficient.
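The fitness value described above, a weighted sum of absolute errors, can be computed as in this small sketch. The function name and the default k = 1.0 are assumptions; a smaller value means a better individual.

```python
def fitness(expected, predicted, k=1.0):
    """Weighted sum of absolute errors between expected and predicted outputs."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    return k * sum(abs(Y - y) for Y, y in zip(expected, predicted))


# Two output neurons, each off by 0.5, give a fitness of 1.0 with k = 1.
example = fitness([1.0, 0.0], [0.5, 0.5])
```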
Step 4-3: selection operation
The selection operation uses the roulette-wheel method: whether an individual enters the next generation is decided according to its fitness, with the formula:
[formula not preserved in the source text]
where $i$ is the chromosome index, $M$ is the population size, $F_i$ is the fitness value of individual $i$, and $P_i$ is the probability that individual $i$ is selected.
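Roulette-wheel selection can be sketched as below. Because the fitness here is an error (lower is better), this sketch assumes each individual's selection weight is the reciprocal of its fitness; that choice, and the function names, are assumptions rather than the patent's own expression.

```python
import random


def selection_probabilities(fitness_values):
    """P_i proportional to 1 / F_i (assumed form; requires F_i > 0)."""
    if any(f <= 0 for f in fitness_values):
        raise ValueError("fitness values must be positive in this sketch")
    weights = [1.0 / f for f in fitness_values]
    total = sum(weights)
    return [w / total for w in weights]


def roulette_select(population, fitness_values, rng):
    """Draw len(population) individuals with replacement, weighted by P_i."""
    probs = selection_probabilities(fitness_values)
    return rng.choices(population, weights=probs, k=len(population))


rng = random.Random(0)
picked = roulette_select([[0.1, 0.2], [0.3, 0.4]], [1.0, 3.0], rng)
```

With fitness values 1.0 and 3.0 the first individual is drawn with probability 0.75 and the second with 0.25 under this assumed weighting.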
Step 4-4: crossover operation
The crossover operation pairs chromosomes and exchanges part of each pair at a randomly chosen position, thereby producing new offspring individuals. For the k-th chromosome $a_k$ and the l-th chromosome $a_l$ crossing at position $j$, the crossover expression is:
[formula not preserved in the source text]
where $a_{kj}$ is the j-th gene of the k-th chromosome and $a_{lj}$ is the j-th gene of the l-th chromosome, that is, the genes at the same position $j$ in two different individuals; $P_c$ is the crossover probability, a random number in [0, 1];
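For real-coded chromosomes, one common way to cross two parents at position j is an arithmetic blend of the two genes. The sketch below assumes that form; the blend factor `b` and the function name are hypothetical and are not taken from the source.

```python
def crossover_at(a_k, a_l, j, b):
    """Blend gene j of two parents with factor b in [0, 1] (assumed form).

    child_k[j] = a_k[j] * (1 - b) + a_l[j] * b
    child_l[j] = a_l[j] * (1 - b) + a_k[j] * b
    """
    child_k, child_l = list(a_k), list(a_l)
    child_k[j] = a_k[j] * (1.0 - b) + a_l[j] * b
    child_l[j] = a_l[j] * (1.0 - b) + a_k[j] * b
    return child_k, child_l


# Crossing position 1 with b = 0.5 moves both genes to their midpoint.
c_k, c_l = crossover_at([1.0, 2.0], [3.0, 4.0], 1, 0.5)
```

A blend keeps each child gene between the two parent genes, so values that start inside the allowed weight range stay inside it.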
Step 4-5: Mutation operation
Mutation selects several individuals from the population at random with a certain probability, and then mutates a randomly chosen position in each selected individual. The j-th gene $a_{ij}$ of the i-th individual is selected for mutation, using the following step function:
$$f(g) = r \left(1 - \frac{g}{G_{\max}}\right)^2$$
where $r$ is a random number in (0, 1); $g$ is the current iteration number; $G_{\max}$ is the maximum number of evolution iterations; $a_{\max}$ and $a_{\min}$ are the upper and lower bounds of gene $a_{ij}$; $P_m$ is the mutation probability, a random number in [0, 1].
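The step function f(g) shrinks as the iteration count approaches its maximum, so mutations become smaller late in the search. The sketch below uses f(g) as given in the text; the update rule that moves the gene toward one of its bounds is an assumption made for illustration, as are the function names.

```python
import random


def step_size(r, g, g_max):
    """f(g) = r * (1 - g / G_max) ** 2."""
    return r * (1.0 - g / g_max) ** 2


def mutate_gene(a, a_min, a_max, g, g_max, rng):
    """Move gene a toward a_max or a_min by the fraction f(g) (assumed rule)."""
    f = step_size(rng.random(), g, g_max)
    if rng.random() > 0.5:
        return a + (a_max - a) * f
    return a - (a - a_min) * f


rng = random.Random(1)
mutated = mutate_gene(0.2, -1.0, 1.0, g=3, g_max=10, rng=rng)
```

Because f(g) lies in [0, 1), the mutated gene stays within its bounds, and at g = G_max the gene is left unchanged.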
Step 4-6: Repeat steps 4-3 to 4-5 until the maximum number of iterations is reached or the global optimum satisfies the minimum fitness value. The optimal individual obtained is then decoded to give the initial weight and threshold combination (w, b) of the neural network, yielding the GA-Elman neural network model with optimal performance.
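The decoding step can be sketched in Python as follows. The chromosome layout (input weights, context weights, hidden thresholds, output weights, output thresholds), the tanh hidden activation, and the treatment of thresholds as additive biases are all assumptions made for illustration; the embodiment does not fix them in this passage.

```python
import math


def _rows(flat, cols):
    """Reshape a flat list into rows of length cols."""
    return [flat[i:i + cols] for i in range(0, len(flat), cols)]


def decode(chromosome, n_in, n_hid, n_out):
    """Split a flat real-valued chromosome into Elman parameters (w, b)."""
    sizes = [n_hid * n_in, n_hid * n_hid, n_hid, n_out * n_hid, n_out]
    if len(chromosome) != sum(sizes):
        raise ValueError("chromosome length does not match the topology")
    parts, start = [], 0
    for size in sizes:
        parts.append(list(chromosome[start:start + size]))
        start += size
    w_in, w_ctx, b_hid, w_out, b_out = parts
    return (_rows(w_in, n_in), _rows(w_ctx, n_hid), b_hid,
            _rows(w_out, n_hid), b_out)


def elman_step(x, context, params):
    """One forward step; the returned hidden state is the next context."""
    w_in, w_ctx, b_hid, w_out, b_out = params
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(w_in[h], x))
                  + sum(w * c for w, c in zip(w_ctx[h], context))
                  + b_hid[h])
        for h in range(len(b_hid))
    ]
    # Output neurons apply a linear weighting of the hidden outputs.
    output = [sum(w * hv for w, hv in zip(w_out[o], hidden)) + b_out[o]
              for o in range(len(b_out))]
    return output, hidden
```

For a network with 2 inputs, 3 hidden neurons, and 1 output, the chromosome under this layout has 6 + 9 + 3 + 3 + 1 = 22 genes.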
In this embodiment, preferably, in S5 the prediction result is analyzed using the root mean square error index: if the error is large, training is performed again; if the error is within the allowable range, the GA-Elman neural network is considered successfully trained. The root mean square error is calculated as follows:
$$\sigma_{MSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y(i) - \hat{y}(i)\right)^{2}}$$
where $\sigma_{MSE}$ is the root mean square error, $i = 1, 2, \ldots, N$, $N$ is the number of predicted samples, $y(i)$ is the true value of the $i$-th sample of the Elman neural network, and $\hat{y}(i)$ is the predicted result of the $i$-th sample of the Elman neural network.
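A minimal Python sketch of the root mean square error check described above. The function name `rmse` is illustrative, and the allowable error range is left to the caller because the text does not give a numeric threshold.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error over N samples."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Example call: compare true labels with predicted scores.
example_error = rmse([1.0, 0.0, 1.0, 0.0], [0.9, 0.2, 0.7, 0.1])
```

If the returned value exceeds the chosen allowable range, the network would be retrained as the step describes.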
After the computation is finished, the output value is inverse-normalized to obtain the credit scoring result, using the following formula:
$$x_i = (x_{\max} - x_{\min}) \cdot y_i + x_{\min}$$
where $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the input quantity of the training samples, respectively, and $x_i$ and $y_i$ are the values of the sample before and after normalization, respectively.
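A minimal Python sketch of min-max normalization and its inverse. The inverse follows the formula above; the forward mapping y = (x - x_min) / (x_max - x_min) is the form implied by that inverse, and the function names and sample values are illustrative assumptions.

```python
def normalize(values):
    """Scale values to [0, 1] and return them with x_min and x_max."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot normalize a constant series")
    scaled = [(x - x_min) / (x_max - x_min) for x in values]
    return scaled, x_min, x_max


def denormalize(y, x_min, x_max):
    """Inverse normalization: x = (x_max - x_min) * y + x_min."""
    return (x_max - x_min) * y + x_min


# Round trip on made-up values: scale, then map back to the original range.
scaled, lo, hi = normalize([300.0, 500.0, 850.0])
restored = [denormalize(y, lo, hi) for y in scaled]
```

The same x_min and x_max taken from the training samples must be reused when mapping model outputs back to the original scale.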
In this embodiment, preferably, in S6 the credit scoring model of the genetic-algorithm-optimized Elman neural network is deployed to the application platform to output credit scores for applications in real time, enabling real-time approval of applying customers. Also in S6, repayment performance data is periodically fed into model training so that the model is updated online.
Example 2
The invention also provides a credit risk prediction system based on a genetic-algorithm-optimized Elman neural network, comprising a sample acquisition unit, used to collect personal application information and operation-behavior buried point (event tracking) data as training samples, with post-loan repayment performance as the evaluation result;
a data processing unit, used to extract features from the collected data and to perform missing-value completion, abnormal-value processing, and normalization;
a model training unit, used to set the relevant parameters of the genetic algorithm, combine them with the Elman neural network model, and optimize the initial weights and thresholds of the neural network to obtain the credit score prediction model of the GA-Elman neural network;
a prediction unit, used to perform credit risk prediction for online application customers with the trained Elman neural network.
The Elman neural network provides dynamic, nonlinear mapping and is therefore particularly suitable for credit assessment and prediction in Internet finance. The invention uses the nonlinear optimization capability of the genetic algorithm to improve the parameters of the Elman neural network model, searching globally for the optimal weights and thresholds. This overcomes shortcomings of the conventional Elman neural network, such as an unstable convergence process, slow convergence, and a tendency to become trapped in local optima, and improves the stability and generalization ability of the Elman neural network.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (9)
1. A credit risk prediction method for optimizing an Elman neural network based on a genetic algorithm is characterized by comprising the following steps:
S1, data collection: selecting, from the back end of an Internet financial platform and according to post-loan performance, a certain proportion and number of normally repaying customers and overdue customers as modeling samples, and collecting the personal basic information submitted when each sample customer registered an account and applied, together with operation-behavior buried point data obtained from monitoring software;
S2, data preprocessing: performing missing-value completion, abnormal-value processing, and normalization on the collected data, and then dividing the data into a training set and a test set at a ratio of 7:3;
S3, determining the Elman neural network topology using the sample data of the training set;
S4, setting the relevant parameters of the genetic algorithm, combining them with the neural network model, and using the training set samples to optimize the initial weights and thresholds of the neural network, to obtain the credit score prediction model of the GA-Elman neural network;
S5, inputting the test set data into the GA-Elman neural network to test its prediction performance, and comparing it with Elman neural network models optimized by the gradient descent method and by the particle swarm algorithm;
S6, deploying the credit scoring model of the genetic-algorithm-optimized Elman neural network to the application platform to output real-time application credit scores, and periodically feeding performance data into model training.
2. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S1, a certain proportion and number of normally repaying customers and overdue customers are selected as modeling samples from the back end of the Internet financial platform according to post-loan performance, and the personal basic information submitted at account registration and application is collected together with operation-behavior buried point data obtained from monitoring software, wherein the personal basic information includes: mobile phone number, educational background, marital status, employer, address, contact information, and the personal basic information, credit transaction information, public information, and special record data obtained from the credit report; the buried point data includes device behavior data and log data collected by the tracking points, wherein the device behavior data includes: the number of platform logins, the number of clicks, the click frequency, the total input time and the average input time, mobile phone number data, GPS position, MAC address, IP address data, the application frequency by geographic location, the application frequency by IP address, the device battery level ratio, and the average acceleration of the gyroscope; and the log data includes: the number of logins within 7 days, the time from the first click to the credit application, the maximum number of sessions within one day, behavior statistics for the week before the credit application, mobile Internet behavior data, behavior data within the loan APP, credit history, and full-domain multi-dimensional big data including operator data.
3. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S2, the input quantities of the neural network have different units and differ greatly in magnitude, so the raw data variables are normalized before being input for training to bring them to the same scale, and the normalization processing includes normalization and inverse normalization.
4. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: the Elman neural network in S3 is a typical dynamic neural network that builds on the basic structure of the BP artificial neural network and maps dynamic features by storing internal states, giving the system the ability to adapt to time-varying characteristics; the Elman neural network includes an input layer, a hidden layer, an output layer, and a context layer, wherein the neurons of the input layer transmit signals, the neurons of the output layer perform linear weighting, the activation function of the hidden layer is a linear or nonlinear function, and the context layer is connected back to the input of the hidden layer so as to delay and store the output of the hidden layer.
5. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 4, wherein: in the step S3, the number of neurons in the input layer and the number of neurons in the output layer are determined according to the input/output parameters, and the number of neurons in the hidden layer is determined by an empirical method and a trial and error method.
6. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S4, the relevant parameters of the genetic algorithm are set and combined with the neural network model; a real-number encoding mode is set for encoding; value ranges are set for the weight w and the threshold b of the Elman neural network; a group of real-number sets is selected by an interpolation method as the chromosomes of the individuals; the chromosome genes encode the Elman neural network model parameter combination (w, b) in a binary mode; an initial population is randomly generated; and, with the minimum model output error as the fitness function, the optimal individual is searched for through selection, crossover, and mutation operations, and the initial weight and threshold combination (w, b) of the neural network is determined, so as to obtain the Elman neural network model with optimal performance.
7. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S5, the prediction result is analyzed using the root mean square error index; if the error is large, training is performed again, and if the error is within the allowable range, the GA-Elman neural network is considered successfully trained.
8. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S6, the credit scoring model of the genetic-algorithm-optimized Elman neural network is deployed to the application platform to output real-time application credit scores, enabling real-time approval of applying customers, and performance data is periodically fed into model training so that the model is updated online.
9. A credit risk prediction system based on a genetic-algorithm-optimized Elman neural network, characterized by comprising a sample acquisition unit, used to collect personal application information and operation-behavior buried point (event tracking) data as training samples, with post-loan repayment performance as the evaluation result;
a data processing unit, used to extract features from the collected data and to perform missing-value completion, abnormal-value processing, and normalization;
a model training unit, used to set the relevant parameters of the genetic algorithm, combine them with the Elman neural network model, and optimize the initial weights and thresholds of the neural network to obtain the credit score prediction model of the GA-Elman neural network;
a prediction unit, used to perform credit risk prediction for online application customers with the trained Elman neural network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011049256.3A CN112330435A (en) | 2020-09-29 | 2020-09-29 | Credit risk prediction method and system for optimizing Elman neural network based on genetic algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112330435A true CN112330435A (en) | 2021-02-05 |
Family
ID=74313031
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011049256.3A Pending CN112330435A (en) | 2020-09-29 | 2020-09-29 | Credit risk prediction method and system for optimizing Elman neural network based on genetic algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112330435A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951983A (en) * | 2017-02-27 | 2017-07-14 | 浙江工业大学 | Injector performance Forecasting Methodology based on the artificial neural network using many parent genetic algorithms |
CN107481135A (en) * | 2017-08-16 | 2017-12-15 | 广东工业大学 | A kind of personal credit evaluation method and system based on BP neural network |
CN108090658A (en) * | 2017-12-06 | 2018-05-29 | 河北工业大学 | Arc fault diagnostic method based on time domain charactreristic parameter fusion |
CN108665095A (en) * | 2018-04-27 | 2018-10-16 | 东华大学 | Short term power prediction technique based on genetic algorithm optimization Elman neural networks |
CN110516954A (en) * | 2019-08-23 | 2019-11-29 | 昆明理工大学 | One kind referring to calibration method based on GA-BP neural network algorithm optimization mineral processing production |
CN110633504A (en) * | 2019-08-21 | 2019-12-31 | 中联煤层气有限责任公司 | Prediction method for coal bed gas permeability |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210205 |