CN115660820A - Loan risk model construction and prediction method - Google Patents
- Publication number
- CN115660820A (application number CN202211267567.6A)
- Authority
- CN
- China
- Prior art keywords
- loan
- data
- risk
- user
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The invention provides a loan risk model construction and prediction method. Loan index data and the expected loan amount of a user are obtained by analyzing the basic data of the loan user; a big data network is built by analyzing the loan data of a plurality of users; and a loan risk prediction model is built from the big data and the user's personal loan index data. A loan risk result and a predicted interval value are obtained by matrix transformation in the model, and comparing the loan risk result with the predicted interval value determines whether the user's loan amount should be raised or reduced. Predicting loan risk in this way allows a relatively objective judgment of the user's loan risk based on big data and reduces the deviation in the prediction result caused by an excessively high original loan amount, thereby making the loan more reasonable.
Description
Technical Field
The invention relates to the field of loan risk models, in particular to a method for constructing and predicting a loan risk model.
Background
When a user applies for a loan, the user's basic information and credit information are usually evaluated, together with whether the loan amount is proportionate to the risk borne, so as to reduce the risk of lending to the user.
Disclosure of Invention
The present invention aims to provide a loan risk model construction and prediction method that solves one or more technical problems of the prior art and provides at least one beneficial effect.
A loan risk model construction and prediction method, comprising the following steps:
S100: recognizing the loan materials of the user by OCR technology;
S200: obtaining the basic data and loan index data of the user from the information in the loan materials;
S300: constructing a risk prediction model from the data, and determining a loan weight coefficient through the model;
S400: obtaining a loan risk coefficient from the loan weight coefficient and the prediction result of the model;
S500: evaluating and predicting the risk coefficient to finally obtain a loan risk result.
Further, in step S100, the loan materials provided by the user are recognized by OCR technology and transmitted to the assessment module for assessment, wherein the loan materials include: the user's bank account and bank transaction records, the user's employment income, the user's loan requirement and the purpose of the loan.
Further, in step S200, the basic data of the user is obtained by condensing the information in the loan materials, wherein the basic data includes: the user's bank account, the user's specific income, the user's real estate valuation and the user's average monthly bank transaction flow. The loan risk of the user is preliminarily assessed from the information in the loan materials, the loan index data is preliminarily assessed by the assessment module, and the risk assessment value Q is obtained by calculating the ratio of the user's loan index data to the expected loan amount (formula not reproduced in the source text).
Further, in step S300, a risk prediction model is constructed from the obtained basic data, loan index data and risk assessment value, and the model is constructed as follows:
S301: standardizing the data values of the loan index data and the risk assessment value Q by MapReduce, storing the loan index data and the risk assessment value Q in a distributed manner, and distributing them to the storage nodes to obtain node data; establishing a time-ordered set node, and sequentially adding the node data of a plurality of users to the set node as elements in order of loan time, wherein node(d) is the node data value of the d-th loan user, d ∈ [1, k], and k is the total number of loan users sampled this time; the set node is then built into an array N;
S302: performing information interaction on the set node to form a data model, and packaging the information in the data model into a data stream for transmission: the node data in the set node is packaged sequentially in time order into a data stream that comprises the basic data, the loan index data and the risk assessment value Q. Sets N and M are initially empty, time-ordered sets; the loan index data and the user's expected loan amount are added to sets N and M respectively in order of the user's application time, and the data in sets N and M is converted into an array TLN. The data stream is divided into batches for task processing, with the risk assessment value Q used as the reference value for the task processing, and a loan weight coefficient W is obtained by calculation (formula not reproduced in the source text),
wherein node(d) is the data stream of the d-th user in the set node, Qd is the risk assessment value of the d-th user, ni is an element of set N, and mi is an element of set M; the loan weight coefficient W is a variable value that depends on the customer's serial number; an array S is constructed by sorting sets N and M by the loan weight coefficient W; and the proportion given to the loan amount in the judgment is adjusted through the loan weight coefficient W;
S303: substituting the loan weight coefficient into the model to build an adjustment matrix for the loan weight, obtaining a new prediction result array Ass of the model (formula not reproduced in the source text), wherein x and y are the row and column indices in the matrix, x, y ∈ [1, 2, …, n], S_pq is the element in row p, column q of array S, S_pp is the element in row p, column p of array C, and S_qq is the element in row q, column q of array C;
the minimum data values of the rows and columns in array Ass are defined as Ah and As; a set D is constructed by screening the data of the rows and columns in which Ah and As are located; the data of set D is normalized and rebuilt into an array ADE; and the predicted value FCT of the loan risk result is calculated from array ADE (formula not reproduced in the source text),
wherein ln() is the natural logarithm, Ass(a_xy) is the data in row x, column y of matrix Ass, exp() is the exponential function with base e, mean() is the averaging function, Max(Ass(a_xy)) is the maximum value in array Ass, Max(ADE) is the maximum value of array ADE, and Min(ADE) is the minimum value of array ADE.
The beneficial effects of this step are as follows: a matrix is constructed from the basic data, loan index data and risk assessment values Q of a plurality of users in order to analyze the loan users; the minimum data values, which correspond to lower risk, are screened out through the matrix; and the predicted value FCT is calculated from the array ADE built from those minimum values, which avoids an inaccurate predicted value caused by excessively high loan risk values in the big data.
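The data preparation in steps S301 and S302 can be illustrated with a minimal in-memory sketch. It is not the MapReduce implementation the text refers to: z-score standardization is an assumed choice (the text only says the values are standardized), the dictionary keys and function names are hypothetical, loan time and application time are assumed to coincide, and array TLN is assumed to pair each element of N with the matching element of M. The loan weight coefficient W is not computed, because its formula is not available in the text.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score standardization (assumed method); an all-equal column maps to zeros."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def build_node_set(records):
    """Order user records by loan time and standardize loan index data and Q.

    Each record is a dict with the hypothetical keys 'loan_time',
    'loan_index', 'expected_amount' and 'q'.
    """
    ordered = sorted(records, key=lambda r: r["loan_time"])
    index_std = standardize([r["loan_index"] for r in ordered])
    q_std = standardize([r["q"] for r in ordered])
    return [
        {**r, "loan_index": i, "q": q}
        for r, i, q in zip(ordered, index_std, q_std)
    ]


def build_tln(node):
    """Fill the time-ordered sets N and M, then pair them row by row as TLN."""
    n_set = [r["loan_index"] for r in node]       # set N: loan index data
    m_set = [r["expected_amount"] for r in node]  # set M: expected loan amounts
    tln = [[n, m] for n, m in zip(n_set, m_set)]  # assumed layout of array TLN
    return n_set, m_set, tln
```

Keeping the ordering step separate from the pairing step mirrors the text, where the set node is built first and sets N and M are filled from it afterwards.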
Further, in step S400, the loan weight coefficient increases the weight of the high-value items in the user's basic information, the user's expected loan amount is adjusted by the weight coefficient so as to reduce the user's loan risk, and the loan risk result is obtained by calculating a function E() (expression not reproduced in the source text),
wherein exp() is the exponential function with base e, Wd is the loan weight coefficient W of the d-th user, Qd is the risk assessment value of the d-th user, and ln() is the natural logarithm. The function E() is compared with the predicted value FCT of the loan risk result, a constraint condition is defined from the result of that comparison (constraint not reproduced in the source text), and the function E() is limited to within the constraint condition.
(The beneficial effect of this step is that the loan risk result is compared with the calculated predicted value FCT and constrained accordingly, so that it is less affected by excessively large or small loan data and its deviation is reduced.)
The method runs on a system comprising a memory, a processor and a display, wherein a computer program is stored in the memory and, when executed by the processor, implements the steps of the loan risk model construction and prediction method described above.
The beneficial effects of the invention are as follows: the loan risk prediction model is constructed from a big data network built from the information of a large number of loan users together with the basic information of the individual user; the interval value of the risk range is obtained through matrix data transformation in the model; and the individual user's loan risk is predicted against that interval value, which reduces the deviation in loan prediction caused by some excessively large or small loan data and improves the rationality of the loan amount decision.
Drawings
The above and other features of the invention will become more apparent from the detailed description of the embodiments shown in the accompanying drawings, in which like reference characters designate the same or similar elements. The drawings described below are merely examples of the invention, and those skilled in the art may derive other drawings from them without inventive effort.
In the drawings:
FIG. 1 is a flow chart of a loan risk model construction and prediction method.
Detailed Description
The conception, the specific structure and the technical effects of the present invention will be clearly and completely described in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the schemes and the effects of the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including it. Where "first" and "second" are used, they serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the technical features indicated, or their order of precedence.
As shown in FIG. 1, a loan risk model construction and prediction method comprises the following steps:
S100: recognizing the loan materials of the user by OCR technology;
S200: obtaining the basic data and loan index data of the user from the information in the loan materials;
S300: constructing a risk prediction model from the data, and determining a loan weight coefficient through the model;
S400: obtaining a loan risk coefficient from the loan weight coefficient and the prediction result of the model;
S500: evaluating and predicting the risk coefficient to finally obtain a loan risk result.
Further, in step S100, the loan materials provided by the user are recognized by OCR technology and transmitted to the assessment module for assessment, wherein the loan materials include: the user's bank account and bank transaction records, the user's employment income, the user's loan requirement and the purpose of the loan.
Further, in step S200, the basic data of the user is obtained by condensing the information in the loan materials, wherein the basic data includes: the user's bank account, the user's specific income, the user's real estate valuation and the user's average monthly bank transaction flow. The loan risk of the user is preliminarily assessed from the information in the loan materials, the loan index data is preliminarily assessed by the assessment module, and the risk assessment value Q is obtained by calculating the ratio of the user's loan index data to the expected loan amount (formula not reproduced in the source text).
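The ratio that defines the risk assessment value Q can be written down directly. The sketch below assumes the loan index data has already been reduced to a single number by the assessment module and that the ratio is taken as loan index data divided by expected loan amount; the text does not state the orientation, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoanApplication:
    """Hypothetical container for the two values named in step S200."""
    loan_index: float       # loan index data from the assessment module (assumed scalar)
    expected_amount: float  # loan amount the user expects to borrow


def risk_assessment_value(app: LoanApplication) -> float:
    """Return Q as loan index data divided by expected loan amount (assumed orientation)."""
    if app.expected_amount <= 0:
        raise ValueError("expected loan amount must be positive")
    return app.loan_index / app.expected_amount
```

Under this orientation, requesting a larger amount against the same loan index data lowers Q.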
Further, in step S300, a risk prediction model is constructed from the obtained basic data, loan index data and risk assessment value, and the model is constructed as follows:
S301: standardizing the data values of the loan index data and the risk assessment value Q by MapReduce, storing the loan index data and the risk assessment value Q in a distributed manner, and distributing them to the storage nodes to obtain node data; establishing a time-ordered set node, and sequentially adding the node data of a plurality of users to the set node as elements in order of loan time, wherein node(d) is the node data value of the d-th loan user, d ∈ [1, k], and k is the total number of loan users sampled this time; the set node is then built into an array N;
S302: performing information interaction on the set node to form a data model, and packaging the information in the data model into a data stream for transmission: the node data in the set node is packaged sequentially in time order into a data stream that comprises the basic data, the loan index data and the risk assessment value Q. Sets N and M are initially empty, time-ordered sets; the loan index data and the user's expected loan amount are added to sets N and M respectively in order of the user's application time, and the data in sets N and M is converted into an array TLN. The data stream is divided into batches for task processing, with the risk assessment value Q used as the reference value for the task processing, and a loan weight coefficient W is obtained by calculation (formula not reproduced in the source text),
wherein node(d) is the data stream of the d-th user in the set node, Qd is the risk assessment value of the d-th user, ni is an element of set N, and mi is an element of set M; the loan weight coefficient W is a variable value that depends on the customer's serial number; an array S is constructed by sorting sets N and M by the loan weight coefficient W; and the proportion given to the loan amount in the judgment is adjusted through the loan weight coefficient W;
S303: substituting the loan weight coefficient into the model to build an adjustment matrix for the loan weight, obtaining a new prediction result array Ass of the model (formula not reproduced in the source text), wherein x and y are the row and column indices in the matrix, x, y ∈ [1, 2, …, n], S_pq is the element in row p, column q of array S, S_pp is the element in row p, column p of array C, and S_qq is the element in row q, column q of array C;
the minimum data values of the rows and columns in array Ass are defined as Ah and As; a set D is constructed by screening the data of the rows and columns in which Ah and As are located; the data of set D is normalized and rebuilt into an array ADE; and the predicted value FCT of the loan risk result is calculated from array ADE (formula not reproduced in the source text),
wherein ln() is the natural logarithm, Ass(a_xy) is the data in row x, column y of matrix Ass, exp() is the exponential function with base e, mean() is the averaging function, Max(Ass(a_xy)) is the maximum value in array Ass, Max(ADE) is the maximum value of array ADE, and Min(ADE) is the minimum value of array ADE.
The beneficial effects of this step are as follows: a matrix is constructed from the basic data, loan index data and risk assessment values Q of a plurality of users in order to analyze the loan users; the minimum data values, which correspond to lower risk, are screened out through the matrix; and the predicted value FCT is calculated from the array ADE built from those minimum values, which avoids an inaccurate predicted value caused by excessively high loan risk values in the big data.
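The screening and normalization that turn array Ass into array ADE can be sketched as follows. The text leaves the exact screening rule open, so this is one possible reading only: the smallest entry of Ass is located, the row and column containing it form set D, and D is rescaled by min-max normalization (an assumed choice of normalization). The function name is hypothetical, and the predicted value FCT itself is not computed because its formula is not available in the text.

```python
def screen_and_normalize(ass):
    """Return (row, col, ade) for a matrix given as a list of equal-length lists.

    row, col locate the smallest entry of the matrix; ade is the min-max
    normalized data of that row and that column (assumed reading of set D).
    """
    # Locate the smallest entry; ties resolve to the first one in row-major order.
    row, col = min(
        ((i, j) for i in range(len(ass)) for j in range(len(ass[i]))),
        key=lambda ij: ass[ij[0]][ij[1]],
    )
    # Set D: the whole row, then the column without repeating the shared entry.
    d = list(ass[row]) + [ass[i][col] for i in range(len(ass)) if i != row]
    lo, hi = min(d), max(d)
    if hi == lo:
        ade = [0.0 for _ in d]
    else:
        ade = [(v - lo) / (hi - lo) for v in d]
    return row, col, ade
```

For the matrix `[[4.0, 2.0, 9.0], [7.0, 1.0, 5.0], [3.0, 8.0, 6.0]]`, the smallest entry sits at row 1, column 1 (zero-based), so set D collects the five values 7, 1, 5, 2 and 8 before normalization.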
Further, in step S400, the loan weight coefficient increases the weight of the high-value items in the user's basic information, the user's expected loan amount is adjusted by the weight coefficient so as to reduce the user's loan risk, and the loan risk result is obtained by calculating a function E() (expression not reproduced in the source text),
wherein exp() is the exponential function with base e, Wd is the loan weight coefficient W of the d-th user, Qd is the risk assessment value of the d-th user, and ln() is the natural logarithm. The function E() is compared with the predicted value FCT of the loan risk result, a constraint condition is defined from the result of that comparison (constraint not reproduced in the source text), and the function E() is limited to within the constraint condition.
(The beneficial effect of this step is that the loan risk result is compared with the calculated predicted value FCT and constrained accordingly, so that it is less affected by excessively large or small loan data and its deviation is reduced.)
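Limiting E() to a constraint can be pictured as clamping a value to an interval. Since neither the expression for E() nor the constraint derived from FCT is available in the text, the sketch below simply takes the already computed value and two bounds as inputs; `lower` and `upper` are hypothetical parameters standing in for whatever bounds the comparison with FCT yields.

```python
def constrain_risk(e_value: float, lower: float, upper: float) -> float:
    """Clamp a loan risk result to the closed interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return max(lower, min(e_value, upper))
```

Values already inside the interval pass through unchanged, so only outliers produced by unusually large or small loan data are pulled back to the nearest bound.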
The method runs on a system comprising a memory, a processor and a display, wherein a computer program is stored in the memory and, when executed by the processor, implements the steps of the loan risk model construction and prediction method described above.
The loan risk model construction and prediction system can run on computing devices such as desktop computers, notebooks, palmtop computers and cloud data centers. The system includes, but is not limited to, a processor and a memory. Those skilled in the art will understand that this is merely an example of a loan risk model construction and prediction system and does not limit it; the system may include more or fewer components than those described, may combine some components, or may use different components. For example, it may further include input and output devices, network access devices, buses and the like.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the loan risk model construction and prediction system and connects the parts of the whole system through various interfaces and lines.
The memory may be used to store the computer programs and/or modules, and the processor implements the various functions of the loan risk model construction and prediction system by running or executing the computer programs and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required by at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created through use of the cellular phone (such as audio data or a phonebook). In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage device.
The invention provides a loan risk model construction and prediction method. Loan index data and the expected loan amount of a user are obtained by analyzing the basic data of the loan user; a big data network is built by analyzing the loan data of a plurality of users; and a loan risk prediction model is built from the big data and the user's personal loan index data. A loan risk result and a predicted interval value are obtained by matrix transformation in the model, and comparing the loan risk result with the predicted interval value determines whether the user's loan amount should be raised or reduced.
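The final comparison between the loan risk result and the predicted interval value can be sketched as a three-way decision. The direction of the adjustment is an assumption made here for illustration: a risk result above the interval suggests reducing the amount and a result below it suggests room to raise the amount. The function name, parameters and return labels are all hypothetical.

```python
def amount_adjustment(risk_result: float, interval_low: float, interval_high: float) -> str:
    """Suggest how to adjust the loan amount from the risk result and predicted interval."""
    if interval_low > interval_high:
        raise ValueError("interval_low must not exceed interval_high")
    if risk_result > interval_high:
        return "decrease"  # assumed: risk above the interval, reduce the amount
    if risk_result < interval_low:
        return "increase"  # assumed: risk below the interval, the amount may be raised
    return "keep"
```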
Although the present invention has been described in considerable detail and with reference to certain illustrated embodiments, it is not intended to be limited to any such details or embodiments or any particular embodiment, so as to effectively encompass the intended scope of the invention. Furthermore, the foregoing describes the invention in terms of embodiments foreseen by the inventor for which an enabling description was available, notwithstanding that insubstantial modifications of the invention, not presently foreseen, may nonetheless represent equivalent modifications thereto.
Claims (6)
1. A loan risk model construction and prediction method, characterized in that the method comprises the following steps:
S100: recognizing the loan materials of the user by OCR technology;
S200: obtaining the basic data and loan index data of the user from the information in the loan materials;
S300: constructing a risk prediction model from the data, and determining a loan weight coefficient through the model;
S400: obtaining a loan risk coefficient from the loan weight coefficient and the prediction result of the model;
S500: evaluating and predicting the risk coefficient to finally obtain a loan risk result.
2. The loan risk model construction and prediction method according to claim 1, wherein in step S100, the loan materials provided by the user are recognized by OCR technology and transmitted to the assessment module for assessment, wherein the loan materials include: the user's bank account and bank transaction records, the user's employment income, the user's loan requirement and the purpose of the loan.
3. The loan risk model construction and prediction method according to claim 1, wherein in step S200, the basic data of the user is obtained by condensing the information in the loan materials, wherein the basic data includes: the user's bank account, the user's specific income, the user's real estate valuation and the user's average monthly bank transaction flow; the loan risk of the user is preliminarily assessed from the information in the loan materials, the loan index data is preliminarily assessed by the assessment module, and the risk assessment value Q is obtained by calculating the ratio of the user's loan index data to the expected loan amount (formula not reproduced in the source text).
4. The loan risk model construction and prediction method according to claim 1, wherein in step S300, the risk prediction model is constructed from the obtained basic data, loan index data and risk assessment value, and the model is constructed as follows:
S301: standardizing the data values of the loan index data and the risk assessment value Q by MapReduce, storing the loan index data and the risk assessment value Q in a distributed manner, and distributing them to the storage nodes to obtain node data; establishing a time-ordered set node, and sequentially adding the node data of a plurality of users to the set node as elements in order of loan time, wherein node(d) is the node data value of the d-th loan user, d ∈ [1, k], and k is the total number of loan users sampled this time; the set node is then built into an array N;
S302: performing information interaction on the set node to form a data model, and packaging the information in the data model into a data stream for transmission: the node data in the set node is packaged sequentially in time order into a data stream that comprises the basic data, the loan index data and the risk assessment value Q; sets N and M are initially empty, time-ordered sets; the loan index data and the user's expected loan amount are added to sets N and M respectively in order of the user's application time, and the data in sets N and M is converted into an array TLN; the data stream is divided into batches for task processing, with the risk assessment value Q used as the reference value for the task processing, and a loan weight coefficient W is obtained by calculation (formula not reproduced in the source text),
wherein node(d) is the data stream of the d-th user in the set node, Qd is the risk assessment value of the d-th user, ni is an element of set N, and mi is an element of set M; the loan weight coefficient W is a variable value that depends on the customer's serial number; an array S is constructed by sorting sets N and M by the loan weight coefficient W; and the proportion given to the loan amount in the judgment is adjusted through the loan weight coefficient W;
S303: substituting the loan weight coefficient into the model to build an adjustment matrix for the loan weight, obtaining a new prediction result array Ass of the model (formula not reproduced in the source text), wherein x and y are the row and column indices in the matrix, x, y ∈ [1, 2, …, n], S_pq is the element in row p, column q of array S, S_pp is the element in row p, column p of array C, and S_qq is the element in row q, column q of array C;
the minimum data values of the rows and columns in array Ass are defined as Ah and As; a set D is constructed by screening the data of the rows and columns in which Ah and As are located; the data of set D is normalized and rebuilt into an array ADE; and the predicted value FCT of the loan risk result is calculated from array ADE (formula not reproduced in the source text),
wherein ln() is the natural logarithm, Ass(a_xy) is the data in row x, column y of matrix Ass, exp() is the exponential function with base e, mean() is the averaging function, Max(Ass(a_xy)) is the maximum value in array Ass, Max(ADE) is the maximum value of array ADE, and Min(ADE) is the minimum value of array ADE.
5. The loan risk model construction and prediction method according to claim 1, wherein in step S400, the loan weight coefficient increases the weight of the high-value items in the user's basic information, the user's expected loan amount is adjusted by the weight coefficient so as to reduce the user's loan risk, and the loan risk result is obtained by calculating a function E() (expression not reproduced in the source text),
wherein exp() is the exponential function with base e, Wd is the loan weight coefficient W of the d-th user, Qd is the risk assessment value of the d-th user, and ln() is the natural logarithm; the function E() is compared with the predicted value FCT of the loan risk result, a constraint condition is defined from the result of that comparison (constraint not reproduced in the source text), and the function E() is limited to within the constraint condition.
6. The loan risk model construction and prediction method according to claim 1, wherein the method runs on a system comprising a memory, a processor and a display, wherein a computer program is stored in the memory and, when executed by the processor, implements the steps of the loan risk model construction and prediction method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211267567.6A CN115660820A (en) | 2022-10-17 | 2022-10-17 | Loan risk model construction and prediction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211267567.6A CN115660820A (en) | 2022-10-17 | 2022-10-17 | Loan risk model construction and prediction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115660820A | 2023-01-31 |
Family
ID=84988134
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211267567.6A (pending; published as CN115660820A) | Loan risk model construction and prediction method | 2022-10-17 | 2022-10-17 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115660820A (en) |
- 2022-10-17: CN application CN202211267567.6A filed (published as CN115660820A); status: pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||