CN112330435A - Credit risk prediction method and system for optimizing Elman neural network based on genetic algorithm - Google Patents

Publication number
CN112330435A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011049256.3A
Other languages
Chinese (zh)
Inventor
江远强
李兰
韩璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baiweijinke Shanghai Information Technology Co ltd
Original Assignee
Baiweijinke Shanghai Information Technology Co ltd
Application filed by Baiweijinke Shanghai Information Technology Co ltd
Priority to CN202011049256.3A
Publication of CN112330435A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities

Abstract

The invention discloses a credit risk prediction method and system based on an Elman neural network optimized by a genetic algorithm, comprising the following steps: S1, collecting data: from the back end of an Internet finance platform, selecting a certain proportion and number of normal-repayment and overdue customers as modeling samples according to their post-loan performance, and collecting the personal basic information provided when each sample customer registered an account and applied, together with operation-behavior event-tracking (buried-point) data obtained from monitoring software; S2, preprocessing data: performing missing-value completion, outlier handling and normalization on the collected data, then dividing it into a training set and a test set in a 7:3 ratio; S3, determining the topology of the Elman neural network using the training-set samples; S4, setting the parameters of the genetic algorithm, combining it with the neural network model, and using the training-set samples to optimize the initial weights and thresholds of the neural network. The Elman neural network has dynamic and nonlinear mapping characteristics and is particularly suitable for credit assessment and prediction in Internet finance.

Description

Credit risk prediction method and system for optimizing Elman neural network based on genetic algorithm
Technical Field
The invention belongs to the technical field of risk control in the Internet finance industry, and particularly relates to a credit risk prediction method and system based on an Elman neural network optimized by a genetic algorithm.
Background
In recent years, artificial neural networks have proven to be well-performing models in Internet finance credit assessment. One advantage of using an artificial neural network model to predict, discover and summarize the structure of financial variables is that it does not rely on specific assumptions. The networks most commonly used in credit evaluation today are the BP neural network and the RBF neural network, or improvements based on these two. However, both have disadvantages when processing data for prediction: the BP network is based on gradient descent and suffers from local minima, low robustness and similar defects; the RBF neural network is a static feedforward network, is deficient in dynamic time-series modeling, and cannot adequately meet the requirements of Internet finance credit evaluation.
The Elman neural network is a neural network with feedback. It adds a context layer to the BP neural network to store the hidden-layer output of the previous time step and to process time-delayed data, which gives the network a dynamic memory capability. After training, it exhibits dynamic and nonlinear mapping characteristics and is suited to prediction problems on time-series data. Since financial data is typical time-series data, the Elman neural network is particularly suitable for credit assessment and prediction in Internet finance.
Although the Elman neural network improves on traditional neural networks, its design still raises problems of selecting and optimizing the training algorithm, transfer function, network structure, hidden-layer connection weights, thresholds and so on. In the prior art, gradient descent, particle swarm optimization, simulated annealing and similar algorithms are used to optimize the parameters of the Elman neural network, but they suffer from an unstable convergence process, slow convergence, a tendency to fall into local optima and other defects, so a more suitable optimization algorithm is needed. A credit risk prediction method and system based on an Elman neural network optimized by a genetic algorithm is therefore proposed.
Disclosure of Invention
The invention aims to provide a credit risk prediction method and system based on an Elman neural network optimized by a genetic algorithm, so as to solve the problems raised in the Background.
To achieve this purpose, the invention provides the following technical solution: a credit risk prediction method based on an Elman neural network optimized by a genetic algorithm, comprising the following steps:
S1, collecting data: from the back end of an Internet finance platform, selecting a certain proportion and number of normal-repayment and overdue customers as modeling samples according to their post-loan performance, and collecting the personal basic information provided when each sample customer registered an account and applied, together with operation-behavior event-tracking (buried-point) data obtained from monitoring software;
S2, preprocessing data: performing missing-value completion, outlier handling and normalization on the collected data, then dividing it into a training set and a test set in a 7:3 ratio;
S3, determining the topology of the Elman neural network using the training-set samples;
S4, setting the parameters of the genetic algorithm, combining it with the neural network model, and using the training-set samples to optimize the initial weights and thresholds of the neural network, thereby obtaining the credit score prediction model of the GA-Elman neural network;
S5, inputting the test-set data into the GA-Elman neural network to test its prediction performance, and comparing the results with Elman neural network models optimized by the gradient descent method and the particle swarm algorithm;
S6, deploying the credit scoring model of the genetic-algorithm-optimized Elman neural network to the application platform to output real-time credit scores for applications, and periodically feeding performance data back into model training.
Preferably, in S1, a certain proportion and number of normal-repayment and overdue customers are selected as modeling samples from the back end of the Internet finance platform according to their post-loan performance, the personal basic information provided when each sample customer registered an account and applied is collected, and operation-behavior event-tracking data is obtained from monitoring software. The personal basic information includes: mobile phone number, education level, marital status, employer, address and contact information, together with the personal basic information, credit transaction information, public information and special record data obtained from the credit report. The event-tracking data includes device behavior data and log data collected at the tracking points. The device behavior data includes: number of platform logins, number of clicks, click frequency, total and average input time, mobile phone number data, GPS position, MAC address, IP address data, application frequency by geographic location, application frequency by IP, device battery level, and average gyroscope acceleration. The log data includes: number of logins within 7 days, time from the first click to the credit application, maximum number of sessions in one day, behavior statistics for the week before the credit application, mobile Internet behavior data, behavior data within the loan APP, credit history, and full-domain multi-dimensional big data including operator data.
Preferably, in S2, the inputs of the neural network have different units and differ widely in magnitude, so before training the original data variables need to be normalized to bring them onto the same scale; the normalization processing includes normalization and inverse normalization (denormalization).
Preferably, the Elman neural network in S3 is a typical dynamic neural network. It is built on the basic structure of the BP artificial neural network and maps dynamic features by storing internal states, which gives the system the ability to adapt to time-varying characteristics. The Elman neural network includes an input layer, a hidden layer, an output layer and a context layer: the input-layer neurons transmit signals, the output-layer neurons perform linear weighting, the activation function of the hidden layer may be linear or nonlinear, and the context layer feeds back to the input of the hidden layer, delaying and storing the hidden-layer output.
Preferably, in S3, the numbers of input-layer and output-layer neurons are determined from the input and output parameters, and the number of hidden-layer neurons is determined by combining an empirical rule with trial and error.
Preferably, in S4 the parameters of the genetic algorithm are set and the algorithm is combined with the neural network model. Individuals are encoded with real-number coding; the value ranges of the weights w and thresholds b of the Elman neural network are set; a group of real numbers is selected by interpolation to form each individual's chromosome; the parameter combination (w, b) of the Elman neural network model encoded by the chromosome genes is represented in binary form; an initial population is generated randomly; the minimum model output error is used as the fitness function; the optimal individual is found through selection, crossover and mutation operations; and the initial weight and threshold combination (w, b) of the neural network is determined, giving the Elman neural network model with optimal performance.
Preferably, in S5, the prediction results are analyzed using the root mean square error; if the error is large, training is repeated; if the error is within the allowable range, the GA-Elman neural network is considered successfully trained.
Preferably, in S6, the credit scoring model of the genetic-algorithm-optimized Elman neural network is deployed to the application platform to output real-time credit scores for applications, enabling real-time approval of applicants, and performance data is periodically fed back into model training, enabling online updating of the model.
The invention also provides a credit risk prediction system based on an Elman neural network optimized by a genetic algorithm, which comprises a sample acquisition unit: used to collect training samples, including personal application information, operation-behavior event-tracking data, and post-loan repayment performance as the evaluation result;
a data processing unit: used to extract features from the collected data and to perform missing-value completion, outlier handling and normalization;
a model training unit: used to set the parameters of the genetic algorithm, combine it with the Elman neural network model, and optimize the initial weights and thresholds of the neural network to obtain the credit score prediction model of the GA-Elman neural network;
a prediction unit: used to perform credit risk prediction for online applicants with the trained Elman neural network.
Compared with the prior art, the invention has the following beneficial effects:
(1) The Elman neural network has dynamic and nonlinear mapping characteristics and is particularly suitable for credit assessment and prediction in Internet finance.
(2) The method uses the nonlinear optimization capability of the genetic algorithm to adjust the parameters of the Elman neural network model automatically and searches globally for the optimal weights and thresholds. This overcomes the unstable convergence, slow convergence and tendency to fall into local optima of conventional Elman training, and improves the stability and generalization capability of the Elman neural network.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to FIG. 1, the present invention provides a technical solution: a credit risk prediction method based on an Elman neural network optimized by a genetic algorithm, comprising the following steps:
S1, collecting data: from the back end of an Internet finance platform, selecting a certain proportion and number of normal-repayment and overdue customers as modeling samples according to their post-loan performance, and collecting the personal basic information provided when each sample customer registered an account and applied, together with operation-behavior event-tracking (buried-point) data obtained from monitoring software;
S2, preprocessing data: performing missing-value completion, outlier handling and normalization on the collected data, then dividing it into a training set and a test set in a 7:3 ratio;
S3, determining the topology of the Elman neural network using the training-set samples, including the numbers of neurons in the input, output and hidden layers, the number of hidden layers, and the initialized weights and thresholds of the neural network;
S4, setting the parameters of the genetic algorithm, combining it with the neural network model, and using the training-set samples to optimize the initial weights and thresholds of the neural network, thereby obtaining the credit score prediction model of the GA-Elman neural network;
S5, inputting the test-set data into the GA-Elman neural network to test its prediction performance, and comparing the results with Elman neural network models optimized by the gradient descent method and the particle swarm algorithm;
S6, deploying the credit scoring model of the genetic-algorithm-optimized Elman neural network to the application platform to output real-time credit scores for applications, and periodically feeding performance data back into model training.
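The 7:3 division in S2 can be illustrated with a minimal sketch. The function name `split_7_3`, the shuffling step and the fixed seed are assumptions made for illustration; the source only states the ratio.

```python
import random

def split_7_3(samples, seed=0):
    """Shuffle the samples and split them 7:3 into (train, test).

    Shuffling and the seed are illustrative assumptions; the source
    specifies only the 7:3 proportion.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

# Hypothetical usage with ten placeholder sample IDs.
train, test = split_7_3(range(10))
```

For labelled credit data, a stratified split that preserves the share of overdue customers in both sets may be preferable, although the source does not say whether stratification is used.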
In this embodiment, preferably, in S1, a certain proportion and number of normal-repayment and overdue customers are selected as modeling samples from the back end of the Internet finance platform according to their post-loan performance, the personal basic information provided when each sample customer registered an account and applied is collected, and operation-behavior event-tracking data is obtained from monitoring software. The personal basic information includes: mobile phone number, education level, marital status, employer, address and contact information, together with the personal basic information, credit transaction information, public information and special record data obtained from the credit report. The event-tracking data includes device behavior data and log data collected at the tracking points. The device behavior data includes: number of platform logins, number of clicks, click frequency, total and average input time, mobile phone number data, GPS position, MAC address, IP address data, application frequency by geographic location, application frequency by IP, device battery level, and average gyroscope acceleration. The log data includes: number of logins within 7 days, time from the first click to the credit application, maximum number of sessions in one day, behavior statistics for the week before the credit application, mobile Internet behavior data, behavior data within the loan APP, credit history, and full-domain multi-dimensional big data including operator data.
In this embodiment, preferably, in S2, the inputs of the neural network have different units and differ widely in magnitude, so before training the original data variables need to be normalized to bring them onto the same scale; the normalization processing includes normalization and inverse normalization (denormalization), expressed respectively as follows:
The normalization expression is:
$$y_i = \frac{x_i - x_{min}}{x_{max} - x_{min}}$$
The denormalization expression is:
$$x_i = (x_{max} - x_{min}) \cdot y_i + x_{min}$$
where x_max and x_min are the maximum and minimum values of the training-sample input, and x_i and y_i are the values of the input sample before and after normalization, respectively.
Normalization maps the original data to the interval [0, 1], which effectively removes the influence of differing units and large numerical differences among the original variables; the predicted values produced by the model are finally restored to real values by denormalization.
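As a minimal sketch of this min-max scheme, the following functions map one feature to [0, 1] and back. The function names, the handling of constant features and the sample values are assumptions for illustration only.

```python
def fit_min_max(values):
    """Return (x_min, x_max) computed from the training values of one feature."""
    return min(values), max(values)

def normalize(x, x_min, x_max):
    """Map x to [0, 1] via y = (x - x_min) / (x_max - x_min)."""
    if x_max == x_min:
        return 0.0  # assumption: a constant feature is mapped to 0
    return (x - x_min) / (x_max - x_min)

def denormalize(y, x_min, x_max):
    """Inverse mapping: x = (x_max - x_min) * y + x_min."""
    return (x_max - x_min) * y + x_min

# Hypothetical feature values (not from the source).
incomes = [3000.0, 4500.0, 12000.0]
lo, hi = fit_min_max(incomes)
scaled = [normalize(v, lo, hi) for v in incomes]
restored = [denormalize(y, lo, hi) for y in scaled]
```

The minimum and maximum are taken from the training samples, so test-set or live values outside that range would fall outside [0, 1] unless they are clipped.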
In this embodiment, preferably, the Elman neural network in S3 is a typical dynamic neural network. It is built on the basic structure of the BP artificial neural network and maps dynamic features by storing internal states, which gives the system the ability to adapt to time-varying characteristics. The Elman neural network includes an input layer, a hidden layer, an output layer and a context layer: the input-layer neurons transmit signals, the output-layer neurons perform linear weighting, the activation function of the hidden layer may be linear or nonlinear, and the context layer feeds back to the input of the hidden layer, delaying and storing the hidden-layer output.
In this embodiment, preferably, in S3, the numbers of input-layer and output-layer neurons are determined from the input and output parameters, and the number of hidden-layer neurons is determined by combining an empirical rule with trial and error; the empirical theoretical value may be determined according to the following empirical rule for sizing the hidden layer:
Figure BDA0002709032870000071
where m is the number of input-layer neurons, n is the number of output-layer neurons, and r is the number of hidden-layer neurons.
The basic Elman neural network algorithm consists of two parts, forward transmission of signals and backward propagation of errors: the actual output is computed in the direction from input to output, and the weights and thresholds of each layer are corrected in the direction from output to input. The mathematical model of the Elman neural network is established as follows:
X(t)=f(w1·Xc(t)+w2U(t-1)+b1)
Xc(t)=X(t-1)
Y(t)=g(w3·X(t)+b2)
where t is the current time step, X(t) is the output of the hidden layer, U(t-1) is the network input at the previous time step, Xc(t) is the output of the context layer, and Y(t) is the output of the prediction network; w1, w2 and w3 are the connection weights from the context layer to the hidden layer, from the input layer to the hidden layer, and from the hidden layer to the output layer, respectively; b1 and b2 are the thresholds of the hidden layer and the output layer, respectively; g(·) is the transfer function of the output neurons, a linear combination of the hidden-layer outputs, for which the linear purelin function is generally chosen; and f(·) is the transfer function of the hidden-layer neurons, generally the Sigmoid function, namely:
$$f(x) = \frac{1}{1 + e^{-x}}$$
The Elman neural network generally uses the BP algorithm to correct the weights, and the learning index function is the sum-of-squared-errors function, expressed as:
$$E = \sum_{t} \left[ Y(t) - Y_d(t) \right]^2$$
where Y(t) is the output of the prediction network and Y_d(t) is the corresponding expected value.
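The forward pass of the model above can be sketched in plain Python. This is an illustration under stated assumptions, not the patent's implementation: the function names, the zero initial context state and the toy weights are hypothetical, and only the three model equations, the Sigmoid hidden activation and the linear output come from the text.

```python
import math

def sigmoid(z):
    """Hidden-layer transfer function f."""
    return 1.0 / (1.0 + math.exp(-z))

def matvec(w, v):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

def elman_step(u_prev, x_prev, w1, w2, w3, b1, b2):
    """One time step; returns (X(t), Y(t))."""
    xc = x_prev                      # Xc(t) = X(t-1)
    rec = matvec(w1, xc)             # context -> hidden
    inp = matvec(w2, u_prev)         # input -> hidden
    x = [sigmoid(r + i + b) for r, i, b in zip(rec, inp, b1)]
    y = [o + b for o, b in zip(matvec(w3, x), b2)]  # g is linear (purelin)
    return x, y

def run_sequence(inputs, w1, w2, w3, b1, b2):
    """Feed a sequence of input vectors; the context starts at zero (assumption)."""
    x = [0.0] * len(b1)
    outputs = []
    for u in inputs:
        x, y = elman_step(u, x, w1, w2, w3, b1, b2)
        outputs.append(y)
    return outputs

def sum_squared_error(outputs, targets):
    """Sum of squared errors over all time steps and output neurons."""
    return sum((y - d) ** 2
               for ys, ds in zip(outputs, targets)
               for y, d in zip(ys, ds))

# Toy network: 2 inputs, 2 hidden neurons, 1 output (hypothetical weights).
w1 = [[0.0, 0.0], [0.0, 0.0]]
w2 = [[0.0, 0.0], [0.0, 0.0]]
w3 = [[1.0, 1.0]]
b1 = [0.0, 0.0]
b2 = [0.0]
outputs = run_sequence([[0.2, 0.4], [0.1, 0.3]], w1, w2, w3, b1, b2)
```

With all hidden weights at zero, each hidden neuron outputs sigmoid(0) = 0.5, so the single linear output is 1.0 at every step; nonzero `w1` is what lets earlier inputs influence later outputs.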
In this embodiment, preferably, in S4 the parameters of the genetic algorithm are set and the algorithm is combined with the neural network model. Individuals are encoded with real-number coding; the value ranges of the weights w and thresholds b of the Elman neural network are set; a group of real numbers is selected by interpolation to form each individual's chromosome; the parameter combination (w, b) of the Elman neural network model encoded by the chromosome genes is represented in binary form; an initial population is generated randomly; the minimum model output error is used as the fitness function; the optimal individual is found through selection, crossover and mutation operations; and the initial weight and threshold combination (w, b) of the neural network is determined, giving the Elman neural network model with optimal performance. The procedure mainly includes the following steps:
Step 4-1: Setting genetic algorithm-related parameters
Initialize the population, including the initial population size M, the crossover probability P_c, the mutation probability P_m, the maximum number of evolution iterations G_max, and the current evolution iteration count g;
Step 4-2: Establishing a fitness function
The Elman neural network is trained, and the sum of the absolute values of the errors between the predicted output and the expected output is taken as the fitness value F of an individual, calculated as:
$$F = k \sum_{i=1}^{n} \left| Y_i - y_i \right|$$
where Y_i is the expected output of the i-th neuron of the Elman neural network, y_i is the predicted output of the i-th neuron of the Elman neural network, n is the number of output-layer neurons, and k is a weighting coefficient.
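The fitness computation described in step 4-2 (weighted sum of absolute errors) can be sketched as follows. The function name and the default k = 1.0 are assumptions; the source does not give a value for k.

```python
def fitness(expected, predicted, k=1.0):
    """Weighted sum of absolute errors between expected and predicted outputs.

    A lower value means a better individual, since this fitness is an error.
    """
    return k * sum(abs(Y - y) for Y, y in zip(expected, predicted))

# Hypothetical outputs for three neurons.
value = fitness([1.0, 0.0, 1.0], [0.5, 0.5, 1.0])
```

In the GA-Elman setting, `predicted` would come from running the network with the weights and thresholds decoded from one chromosome, so each individual in the population gets its own fitness value.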
Step 4-3: selection operation
The selection operation adopts a roulette method, and the individuals can be determined to enter the next generation according to the fitness of the individuals as a judgment standard, wherein the formula is as follows:
Figure BDA0002709032870000081
wherein i is the number of chromosomes, M is the size of the population, FiIs an individual fitness value, PiProbability of being selected for the individual.
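Roulette-wheel selection draws each individual with probability proportional to its share of the total fitness. The sketch below assumes the common form P_i = F_i / sum of all F_j; the function names and sample values are hypothetical.

```python
import random

def selection_probabilities(fitness_values):
    """P_i = F_i / sum_j F_j for each individual."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]

def roulette_select(population, fitness_values, rng):
    """Spin the wheel once and return the selected individual."""
    probs = selection_probabilities(fitness_values)
    pick = rng.random()
    cumulative = 0.0
    for individual, p in zip(population, probs):
        cumulative += p
        if pick < cumulative:
            return individual
    return population[-1]  # guard against floating-point round-off

rng = random.Random(0)
population = ["a", "b"]          # placeholder chromosomes
probs = selection_probabilities([1.0, 3.0])
chosen = roulette_select(population, [1.0, 3.0], rng)
```

Because the fitness in step 4-2 is an error to be minimized, an implementation may spin the wheel on a reciprocal or otherwise inverted score so that low-error individuals are favored; that transformation is an assumption and is not stated in the source.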
Step 4-4: crossover operation
Crossover operations are the crossover of parts of chromosomes by pairwise crossing of randomly positioned chromosome strings, thereby generating new offspring individuals. The k-th chromosome akAnd the l-th chromosome alAnd in j bit interleaving operation, the expression of interleaving operation is as follows:
Figure BDA0002709032870000091
where a_kj is the j-th gene of the k-th chromosome and a_lj is the j-th gene of the l-th chromosome, so that a_kj and a_lj are the genes at the same position j of two different individuals; P_c is the crossover probability, a random number in [0, 1];
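Since the crossover formula itself is not reproduced here, the following sketch uses arithmetic crossover, a common operator for real-coded chromosomes, as an assumed stand-in. The function name and the mixing coefficient `b` are hypothetical and may differ from the patent's expression.

```python
def arithmetic_crossover(a_k, a_l, j, b):
    """Blend gene j of two real-coded chromosomes with mixing coefficient b.

    Assumed form (not confirmed by the source):
        a_kj' = a_kj * (1 - b) + a_lj * b
        a_lj' = a_lj * (1 - b) + a_kj * b
    """
    child_k, child_l = list(a_k), list(a_l)
    child_k[j] = a_k[j] * (1 - b) + a_l[j] * b
    child_l[j] = a_l[j] * (1 - b) + a_k[j] * b
    return child_k, child_l

# Hypothetical parents; cross at gene index 1 with b = 0.25.
child_k, child_l = arithmetic_crossover([1.0, 2.0], [3.0, 4.0], 1, 0.25)
```

With b in [0, 1], each child gene is a convex combination of the parents' genes, so offspring stay inside the range spanned by the parents and the preset bounds on w and b are preserved.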
Step 4-5: Mutation operation
The mutation operation randomly selects several individuals from the population with a certain probability and then, for each selected individual, randomly chooses a position at which to perform an inversion operation. The j-th gene a_ij of the i-th individual is selected for mutation as follows:
f(g)=r(1-g/Gmax)^2
Figure BDA0002709032870000092
where r is a random number in (0, 1); g is the current iteration number; G_max is the maximum number of evolution iterations; a_max and a_min are the upper and lower bounds of gene a_ij, respectively; and P_m is the mutation probability, a random number in [0, 1].
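The factor f(g) = r(1 - g/G_max)^2 shrinks as the iteration count g grows, so mutations become smaller late in the run. The sketch below pairs it with a bounded non-uniform mutation; the update rule that moves a gene toward a_max or a_min with equal chance is an assumption, since the patent's own gene-update expression is not reproduced here.

```python
import random

def shrink_factor(r, g, g_max):
    """f(g) = r * (1 - g / G_max)^2, with r a random number in (0, 1)."""
    return r * (1 - g / g_max) ** 2

def mutate_gene(a_ij, a_min, a_max, g, g_max, rng):
    """Move a gene toward one of its bounds by a step scaled by f(g).

    Assumption: the bound is chosen with probability 0.5 each; this is a
    common form for real-coded GAs, not necessarily the patent's formula.
    """
    f = shrink_factor(rng.random(), g, g_max)
    if rng.random() > 0.5:
        return a_ij + (a_max - a_ij) * f  # step toward the upper bound
    return a_ij + (a_min - a_ij) * f      # step toward the lower bound

rng = random.Random(0)
mutated = [mutate_gene(0.2, -1.0, 1.0, g, 50, rng) for g in range(50)]
```

Because f(g) lies in [0, 1), the mutated gene never leaves the interval [a_min, a_max].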
Step 4-6: Repeat steps 4-3 to 4-5 until the maximum number of iterations is reached or the global optimum satisfies the minimum fitness value; the optimal individual is then decoded to obtain the initial weight and threshold combination (w, b) of the neural network, giving the GA-Elman neural network model with optimal performance.
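Steps 4-1 to 4-6 can be tied together in a compact loop. This is a sketch under several assumptions that are not in the source: a toy error function stands in for the Elman network's output error, roulette weights are reciprocal errors, crossover is arithmetic, one elite individual is carried over each generation, and all names and default parameter values are hypothetical.

```python
import random

def run_ga(error_fn, n_genes, bounds, pop_size=20, p_c=0.7, p_m=0.1,
           g_max=50, seed=0):
    """Minimize error_fn over real-coded chromosomes; pop_size must be even."""
    rng = random.Random(seed)
    lo, hi = bounds
    clamp = lambda v: min(hi, max(lo, v))
    pop = [[rng.uniform(lo, hi) for _ in range(n_genes)]
           for _ in range(pop_size)]
    best = min(pop, key=error_fn)
    history = []
    for g in range(g_max):
        # Selection: roulette wheel on reciprocal error (assumption).
        weights = [1.0 / (error_fn(ind) + 1e-12) for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        children = []
        for k in range(0, pop_size - 1, 2):
            a, b = list(parents[k]), list(parents[k + 1])
            if rng.random() < p_c:  # crossover at one random gene
                j, mix = rng.randrange(n_genes), rng.random()
                a[j], b[j] = (clamp(a[j] * (1 - mix) + b[j] * mix),
                              clamp(b[j] * (1 - mix) + a[j] * mix))
            children += [a, b]
        for child in children:
            if rng.random() < p_m:  # mutation shrinks as g approaches g_max
                j = rng.randrange(n_genes)
                f = rng.random() * (1 - g / g_max) ** 2
                target = hi if rng.random() > 0.5 else lo
                child[j] = clamp(child[j] + (target - child[j]) * f)
        children[0] = list(best)    # elitism (assumption)
        pop = children
        best = min(pop, key=error_fn)
        history.append(error_fn(best))
    return best, history

def sphere(v):
    """Toy stand-in for the network's output error."""
    return sum(x * x for x in v)

best, history = run_ga(sphere, n_genes=3, bounds=(-1.0, 1.0))
```

In the GA-Elman setting, `error_fn` would decode a chromosome into (w, b), run the Elman network on the training set and return the fitness from step 4-2. The returned `best` would then serve as the initial weights and thresholds for subsequent BP training.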
In this embodiment, preferably, in S5, the prediction results are analyzed using the root mean square error; if the error is large, training is repeated; if the error is within the allowable range, the GA-Elman neural network is considered successfully trained. The root mean square error is calculated as:
$$\sigma_{MSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[ y(i) - \hat{y}(i) \right]^2}$$
where σ_MSE is the root mean square error, i = 1, 2, …, N, N is the number of predicted samples, y(i) is the true value of the i-th sample of the Elman neural network, and ŷ(i) is the predicted result for the i-th sample of the Elman neural network.
After the computation is finished, the output value is denormalized to obtain the credit scoring result, using the formula:
$$x_i = (x_{max} - x_{min}) \cdot y_i + x_{min}$$
where x_max and x_min are the maximum and minimum values of the training-sample input, and x_i and y_i are the values of the input sample before and after normalization, respectively.
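The evaluation in S5 can be sketched as a root mean square error followed by a tolerance check. The function names and the tolerance value are assumptions; the source says only that the error must fall within an allowable range.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error over N predicted samples."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def is_qualified(error, tolerance=0.1):
    """Hypothetical acceptance rule: True when the error is within tolerance."""
    return error <= tolerance

# Hypothetical true and predicted values.
error = rmse([0.0, 0.0], [3.0, 4.0])
```

If `is_qualified` returns False, the method calls for training again; otherwise the network's outputs are denormalized to obtain the credit score.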
In this embodiment, preferably, in S6, the credit scoring model of the genetic-algorithm-optimized Elman neural network is deployed to the application platform to output real-time credit scores for applications, enabling real-time approval of applicants, and performance data is periodically fed back into model training, enabling online updating of the model.
Example 2
The invention also provides a credit risk prediction system based on a genetic-algorithm-optimized Elman neural network, which comprises a sample acquisition unit: used to acquire training samples made up of personal application information, operation-behavior buried-point data, and post-loan repayment performance as the evaluation result;
a data processing unit: used to extract features from the collected data and to perform missing-value completion, outlier handling, and normalization;
a model training unit: used to set the relevant genetic algorithm parameters, combine them with the Elman neural network model, and optimize the initial weights and thresholds of the neural network to obtain the GA-Elman neural network credit score prediction model;
a prediction unit: used to predict the credit risk of online applicants with the trained Elman neural network.
The Elman neural network performs dynamic, nonlinear mapping, which makes it well suited to credit assessment and prediction in Internet finance. The invention uses the nonlinear optimization capability of the genetic algorithm to tune the parameters of the Elman neural network model, searching globally for the optimal weights and thresholds. This overcomes the drawbacks of the conventional Elman network, such as an unstable convergence process, slow convergence, and a tendency to fall into local optima, and improves the stability and generalization capability of the Elman neural network.
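To make the dynamic mapping mentioned above concrete, here is a minimal sketch of an Elman forward pass for a single-output network. The context layer simply holds the previous hidden-layer output and feeds it back into the hidden layer at the next step. The function name, the tanh hidden activation, the linear output, and the zero initial context are assumptions made for illustration; the source only says the hidden layer uses a linear or nonlinear function.

```python
import math

def elman_forward(seq, W_in, W_ctx, b_h, W_out, b_o):
    """Run a sequence of input vectors through a single-output Elman net.

    W_in[j][i]  : weight from input i to hidden unit j
    W_ctx[j][k] : weight from context unit k to hidden unit j
    b_h[j]      : hidden threshold (bias); W_out[j], b_o: output layer
    """
    n_hidden = len(b_h)
    context = [0.0] * n_hidden  # assumed zero initial state
    outputs = []
    for x in seq:
        hidden = []
        for j in range(n_hidden):
            s = b_h[j]
            s += sum(W_in[j][i] * x[i] for i in range(len(x)))
            s += sum(W_ctx[j][k] * context[k] for k in range(n_hidden))
            hidden.append(math.tanh(s))
        # Linear weighted sum at the output layer.
        outputs.append(b_o + sum(W_out[j] * hidden[j] for j in range(n_hidden)))
        # The context layer stores this step's hidden output for the next step.
        context = hidden
    return outputs
```

Feeding the same input twice gives two different outputs whenever `W_ctx` is non-zero, which is the memory effect that separates an Elman network from a plain feed-forward one. The weights and thresholds passed in here are exactly the quantities (w, b) that the genetic algorithm is used to initialize.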
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (9)

1. A credit risk prediction method based on a genetic-algorithm-optimized Elman neural network, characterized by comprising the following steps:
S1, data collection: selecting from the back end of an internet financial platform, according to post-loan performance, a certain proportion and number of normally repaying and overdue customers as modeling samples, collecting the personal basic information submitted when each sample customer registered and applied for an account, and obtaining operation-behavior buried-point data from monitoring software;
S2, data preprocessing: performing missing-value completion, outlier handling, and normalization on the collected data, and then dividing the data into a training set and a test set at a ratio of 7:3;
S3, determining the Elman neural network topology using the training-set sample data;
S4, setting the relevant genetic algorithm parameters, combining them with the neural network model, and using the training-set samples to optimize the initial weights and thresholds of the neural network, thereby obtaining the GA-Elman neural network credit score prediction model;
S5, inputting the test-set data into the GA-Elman neural network to test its prediction performance, and comparing it with Elman neural network models optimized by the gradient descent method and by the particle swarm algorithm;
S6, deploying the credit scoring model of the genetic-algorithm-optimized Elman neural network to an application platform to output real-time application credit scores, and periodically feeding performance data back into model training.
2. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S1, a certain proportion and number of normally repaying and overdue customers are selected as modeling samples from the back end of the internet financial platform according to post-loan performance, the personal basic information submitted at account registration and application is collected, and operation-behavior buried-point data is obtained from monitoring software, wherein the personal basic information includes: mobile phone number, educational background, marital status, employer, address, and contact information, as well as the personal basic information, credit transaction information, public information, and special record data obtained from the credit report; the buried-point data comprises device behavior data and log data collected through the buried points, wherein the device behavior data includes: the number of platform logins, the number of clicks, the click frequency, the total and average input time, mobile phone number data, GPS position, MAC address, IP address data, geographic-information application frequency, IP application frequency, device battery level ratio, and average gyroscope acceleration; and the log data includes: the number of logins within 7 days, the time from the first click to the credit application, the maximum number of sessions within one day, behavior statistics for the week before the credit application, mobile internet behavior data, behavior data within the loan APP, credit history, and full-domain multi-dimensional big data including operator data.
3. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S2, because the input quantities of the neural network have different units and differ widely in magnitude, the original data variables are normalized before being input for training so that they are on the same scale, the processing comprising normalization and inverse normalization.
4. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: the Elman neural network in S3 is a typical dynamic neural network built on the basic structure of a BP artificial neural network; by storing internal states it gains the ability to map dynamic features, so that the system can adapt to time-varying characteristics; the Elman neural network comprises an input layer, a hidden layer, an output layer, and a context layer; the input-layer neurons transmit signals, the output-layer neurons perform linear weighting, the activation function of the hidden layer is a linear or nonlinear function, and the context layer is fed back to the input of the hidden layer so as to delay and store the output of the hidden layer.
5. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 4, wherein: in the step S3, the number of neurons in the input layer and the number of neurons in the output layer are determined according to the input/output parameters, and the number of neurons in the hidden layer is determined by an empirical method and a trial and error method.
6. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S4, the relevant genetic algorithm parameters are set and combined with the neural network model; real-number encoding is used for encoding; the value ranges of the weights w and thresholds b of the Elman neural network are set; a group of real-valued individuals is selected as chromosomes by an interpolation method; the chromosome genes encode the Elman neural network model parameter combination (w, b) in binary form; an initial population is randomly generated; with the minimum model output error as the fitness function, the optimal individual is searched for through selection, crossover, and mutation operations; and the initial weight and threshold combination (w, b) of the neural network is determined, thereby obtaining the Elman neural network model with the best performance.
7. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S5, the prediction result is evaluated with the root mean square error; if the error is large, the network is trained again, and if the error is within the allowable range, the GA-Elman neural network is considered successfully trained.
8. The method for predicting the credit risk based on the genetic algorithm optimized Elman neural network as claimed in claim 1, wherein: in S6, the credit scoring model of the genetic-algorithm-optimized Elman neural network is deployed to an application platform to output real-time application credit scores, enabling real-time approval of applying customers, and performance data is periodically fed back into model training so that the model is updated online.
9. A credit risk prediction system based on a genetic-algorithm-optimized Elman neural network, characterized in that it comprises a sample acquisition unit: used to acquire training samples made up of personal application information, operation-behavior buried-point data, and post-loan repayment performance as the evaluation result;
a data processing unit: used to extract features from the collected data and to perform missing-value completion, outlier handling, and normalization;
a model training unit: used to set the relevant genetic algorithm parameters, combine them with the Elman neural network model, and optimize the initial weights and thresholds of the neural network to obtain the GA-Elman neural network credit score prediction model;
a prediction unit: used to predict the credit risk of online applicants with the trained Elman neural network.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011049256.3A CN112330435A (en) 2020-09-29 2020-09-29 Credit risk prediction method and system for optimizing Elman neural network based on genetic algorithm

Publications (1)

Publication Number Publication Date
CN112330435A true CN112330435A (en) 2021-02-05

Family

ID=74313031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011049256.3A Pending CN112330435A (en) 2020-09-29 2020-09-29 Credit risk prediction method and system for optimizing Elman neural network based on genetic algorithm

Country Status (1)

Country Link
CN (1) CN112330435A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951983A (en) * 2017-02-27 2017-07-14 浙江工业大学 Injector performance Forecasting Methodology based on the artificial neural network using many parent genetic algorithms
CN107481135A (en) * 2017-08-16 2017-12-15 广东工业大学 A kind of personal credit evaluation method and system based on BP neural network
CN108090658A (en) * 2017-12-06 2018-05-29 河北工业大学 Arc fault diagnostic method based on time domain charactreristic parameter fusion
CN108665095A (en) * 2018-04-27 2018-10-16 东华大学 Short term power prediction technique based on genetic algorithm optimization Elman neural networks
CN110516954A (en) * 2019-08-23 2019-11-29 昆明理工大学 One kind referring to calibration method based on GA-BP neural network algorithm optimization mineral processing production
CN110633504A (en) * 2019-08-21 2019-12-31 中联煤层气有限责任公司 Prediction method for coal bed gas permeability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210205