CN108492049A - A system for P2P platform operational risk assessment based on logistic regression - Google Patents
A system for P2P platform operational risk assessment based on logistic regression
- Publication number
- CN108492049A (application CN201810301074.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- feature
- platform
- logic
- risk
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- General Business, Economics & Management (AREA)
- Finance (AREA)
- Development Economics (AREA)
- Marketing (AREA)
- Accounting & Taxation (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Educational Administration (AREA)
- Tourism & Hospitality (AREA)
- Quality & Reliability (AREA)
- Operations Research (AREA)
- Game Theory and Decision Science (AREA)
- Technology Law (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The present invention relates to a system for P2P platform operational risk assessment based on logistic regression. A candidate feature table is derived by analyzing and summarizing a large volume of P2P platform data. The features in the candidate feature table are then divided into data-value characteristics and data-existence characteristics, and a correlation analysis is carried out between these two kinds of characteristics and the risk index of the platform in order to select the main features on which the model is trained, which safeguards the accuracy of the model and improves working efficiency. A risk assessment model based on logistic regression is then established. The logistic regression algorithm is simple to deploy, its result is a probability value so the cut-off can be set manually, it trains quickly, and retraining is inexpensive. Training the model with the update formula makes the model more accurate.
Description
Technical field
The present invention relates to a risk assessment system, and in particular to a system for P2P platform operational risk assessment based on logistic regression.
Background art
P2P is the abbreviation of the English term person-to-person, meaning individual to individual; it is also known as peer-to-peer online lending. It is a private micro-lending model in which small sums of money are pooled and lent to people who need funds. It is a kind of internet finance product and covers online lending platforms built on internet and mobile internet technology, together with the related wealth-management behavior and financial services.
There are currently thousands of P2P online lending platforms in China. An analysis of a P2P platform must cover not only the credit risk of borrowers but also the operational risk of the platform itself: some data are needed to judge whether a given platform is bad, that is, whether there is a risk that it will abscond. However, different risk assessment models produce results with different accuracy.
Summary of the invention
The object of the present invention is to disclose a system for P2P platform operational risk assessment based on logistic regression. It provides a method for establishing a risk assessment model based on logistic regression, in which the model is trained by an iterative update method so that it becomes more accurate.
The technical solution that realizes the system for P2P platform operational risk assessment based on logistic regression is as follows:
A system for P2P platform operational risk assessment based on logistic regression, comprising the following steps:
1) First data acquisition: obtain the operation data of multiple P2P online lending platforms;
2) Establish the candidate feature table: screen and extract the operation data obtained by the first data acquisition, extracting features and the data corresponding to each feature; the data corresponding to a feature are defined as feature data; classify the features according to the pattern feature of the feature data and establish the candidate feature table;
3) Correlation analysis and feature selection: carry out a correlation analysis on the features using the Pearson correlation coefficient, the Spearman correlation coefficient, the Kendall correlation coefficient and the p value; through these four analysis methods each feature obtains 4 analysis values; select the features whose correlation analysis absolute value is 0.4 or more and whose p value is at the same time 0.005 or less, and define these features as the main features;
4) Build the model training set: take the data of the first predetermined amount of normal platforms and the first predetermined amount of absconded platforms from the first data acquisition and extract the main feature data of these platforms as the training set; organize the training set as $\{(X^{(1)},Y^{(1)}),(X^{(2)},Y^{(2)}),\dots,(X^{(m)},Y^{(m)})\}$, where $X^{(m)}$ is the vector of main feature data and $Y^{(m)}$ is the survival state of the platform, 1 for absconded and 0 for normal;
5) Establish the risk assessment model: establish the basic function of logistic regression, whose mathematical form is: [formula omitted in the source text]. In logistic regression the training set consists of m labeled samples $\{(X^{(1)},Y^{(1)}),(X^{(2)},Y^{(2)}),\dots,(X^{(m)},Y^{(m)})\}$; the input feature is $X^{(m)}$, the dimension of the feature vector $X$ is $n+1$, and $X_{n+1}=1.0$ is the intercept term. Logistic regression handles a binary classification problem with class label $Y^{(m)} \in \{0,1\}$, so the model function is: [formula omitted in the source text]. Using the maximum likelihood method, the likelihood function of logistic regression is: [formula omitted in the source text], where m is the number of samples. After the logarithm is taken, the likelihood function is expressed as: [formula omitted in the source text]. This function is concave (upward convex), so its maximum can be found by gradient ascent; alternatively, multiplying it by -1 gives a convex function whose minimum, the minimal negative log-likelihood, can be found by gradient descent. The loss function of logistic regression is therefore: [formula omitted in the source text]. Stochastic gradient descent is used to solve for the parameter values; the gradient descent update formula for $\theta$ is: [formula omitted in the source text], where the derivation of $J(\theta)$ is: [formula omitted in the source text], so the gradient descent update formula for $\theta$ is: [formula omitted in the source text], where $\alpha$ is the learning rate. As the updates proceed, $J(\theta)$ gradually approaches its minimum and the updates finally stop, which yields the parameters of the model and thereby determines the final model.
7) Second data acquisition: obtain the operation data of the P2P online lending platform to be assessed;
8) Data preprocessing: extract the main features and main feature data from the operation data obtained by the second data acquisition;
9) Risk output: feed the main features and main feature data obtained from data preprocessing into the trained risk prediction model to obtain a risk value, and input the risk value into the determiner to output the degree of risk.
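The formulas referred to in step 5 are not present in this text. For orientation only, the textbook form of logistic regression that matches the surrounding description is sketched below. The symbols $g$, $h_\theta$ and $\ell$, and the $1/m$ normalization of the loss, are assumptions and are not taken from the source.

```latex
\begin{aligned}
g(z) &= \frac{1}{1+e^{-z}}, \qquad h_\theta(X) = g(\theta^{T}X) \\
L(\theta) &= \prod_{i=1}^{m} h_\theta\bigl(X^{(i)}\bigr)^{Y^{(i)}}\,
             \bigl(1-h_\theta(X^{(i)})\bigr)^{1-Y^{(i)}} \\
\ell(\theta) &= \sum_{i=1}^{m}\Bigl[\,Y^{(i)}\log h_\theta\bigl(X^{(i)}\bigr)
             + \bigl(1-Y^{(i)}\bigr)\log\bigl(1-h_\theta(X^{(i)})\bigr)\Bigr] \\
J(\theta) &= -\frac{1}{m}\,\ell(\theta) \\
\frac{\partial J(\theta)}{\partial \theta_j}
          &= \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(X^{(i)})-Y^{(i)}\bigr)X_j^{(i)} \\
\theta_j &:= \theta_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}
             \bigl(h_\theta(X^{(i)})-Y^{(i)}\bigr)X_j^{(i)}
\end{aligned}
```

In the stochastic variant named in the text, the sum over all m samples is replaced by a single sample per update, so each step uses $\theta_j := \theta_j - \alpha\,(h_\theta(X^{(i)})-Y^{(i)})\,X_j^{(i)}$.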
Further, in the first data acquisition and the second data acquisition, the operation data of the P2P online lending platforms are obtained by web crawling or are provided by the platforms themselves.
Further, the pattern feature includes three types: numeric, character, and judgment.
Further, the first predetermined amount is 100 to 150 and the second predetermined amount is 75 to 100.
Further, the second predetermined amount is at least half of the first predetermined amount.
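The sample-size limits above can be turned into a small check when assembling the training set. The sketch below is illustrative only: it assumes the first predetermined amount is the number of normal platforms and the second is the number of absconded platforms (the text does not say which is which), and the function name `build_training_set` is hypothetical.

```python
import random


def build_training_set(normal, absconded, n_first, n_second, seed=0):
    """Sample platforms and return shuffled (X, Y) pairs; Y is 1 for absconded, 0 for normal.

    Assumption: n_first counts normal platforms, n_second counts absconded ones.
    """
    # Ranges quoted in the text: 100 to 150 and 75 to 100.
    if not 100 <= n_first <= 150:
        raise ValueError("first predetermined amount must be between 100 and 150")
    if not 75 <= n_second <= 100:
        raise ValueError("second predetermined amount must be between 75 and 100")
    # The second amount must be at least half of the first.
    if 2 * n_second < n_first:
        raise ValueError("second predetermined amount must be at least half of the first")

    rng = random.Random(seed)
    picked_normal = rng.sample(normal, n_first)
    picked_absconded = rng.sample(absconded, n_second)
    data = [(x, 0) for x in picked_normal] + [(x, 1) for x in picked_absconded]
    rng.shuffle(data)  # avoid feeding all normal platforms first
    return data
```

Note that with the quoted ranges the half-size condition is always satisfied, since the smallest second amount (75) is half of the largest first amount (150).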
The beneficial effects of the present invention are as follows. A candidate feature table is derived by analyzing and summarizing a large volume of P2P platform data. The features in the candidate feature table are then divided into data-value characteristics and data-existence characteristics, and a correlation analysis is carried out between these two kinds of characteristics and the risk index of the platform in order to select the main features on which the model is trained, which safeguards the accuracy of the model and improves working efficiency. A risk assessment model based on logistic regression is established. The logistic regression algorithm is simple to deploy, its result is a probability value so the cut-off can be set manually, it trains quickly, and retraining is inexpensive. Training the model with the update formula makes the model more accurate.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention rather than all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Embodiment: a system for P2P platform operational risk assessment based on logistic regression, comprising the following steps:
1) First data acquisition: obtain the operation data of multiple P2P online lending platforms;
2) Establish the candidate feature table: screen and extract the operation data obtained by the first data acquisition, extracting features and the data corresponding to each feature; the data corresponding to a feature are defined as feature data; classify the features according to the pattern feature of the feature data and establish the candidate feature table;
3) Correlation analysis and feature selection: carry out a correlation analysis on the features using the Pearson correlation coefficient, the Spearman correlation coefficient, the Kendall correlation coefficient and the p value; through these four analysis methods each feature obtains 4 analysis values; select the features whose correlation analysis absolute value is 0.4 or more and whose p value is at the same time 0.005 or less, and define these features as the main features. The p value determines how reliable the measured correlation is: by definition, p = 0.05 means that an association at least as strong as the one observed in the sample would arise by chance about 5% of the time if the variables were in fact unrelated, and p = 0.005 means that it would arise by chance about 0.5% of the time. The choice of the p value threshold determines the number of main features; after repeated simulation, p = 0.005 is considered the most reasonable choice;
4) Build the model training set: take the data of the first predetermined amount of normal platforms and the first predetermined amount of absconded platforms from the first data acquisition and extract the main feature data of these platforms as the training set; organize the training set as $\{(X^{(1)},Y^{(1)}),(X^{(2)},Y^{(2)}),\dots,(X^{(m)},Y^{(m)})\}$, where $X^{(m)}$ is the vector of main feature data and $Y^{(m)}$ is the survival state of the platform, 1 for absconded and 0 for normal;
5) Establish the risk assessment model: establish the basic function of logistic regression, whose mathematical form is: [formula omitted in the source text]. In logistic regression the training set consists of m labeled samples $\{(X^{(1)},Y^{(1)}),(X^{(2)},Y^{(2)}),\dots,(X^{(m)},Y^{(m)})\}$; the input feature is $X^{(m)}$, the dimension of the feature vector $X$ is $n+1$, and $X_{n+1}=1.0$ is the intercept term. Logistic regression handles a binary classification problem with class label $Y^{(m)} \in \{0,1\}$, so the model function is: [formula omitted in the source text]. Using the maximum likelihood method, the likelihood function of logistic regression is: [formula omitted in the source text], where m is the number of samples. After the logarithm is taken, the likelihood function is expressed as: [formula omitted in the source text]. This function is concave (upward convex), so its maximum can be found by gradient ascent; alternatively, multiplying it by -1 gives a convex function whose minimum, the minimal negative log-likelihood, can be found by gradient descent. The loss function of logistic regression is therefore: [formula omitted in the source text]. Stochastic gradient descent is used to solve for the parameter values; the gradient descent update formula for $\theta$ is: [formula omitted in the source text], where the derivation of $J(\theta)$ is: [formula omitted in the source text], so the gradient descent update formula for $\theta$ is: [formula omitted in the source text], where $\alpha$ is the learning rate. As the updates proceed, $J(\theta)$ gradually approaches its minimum and the updates finally stop, which yields the parameters of the model and thereby determines the final model.
7) Second data acquisition: obtain the operation data of the P2P online lending platform to be assessed;
8) Data preprocessing: extract the main features and main feature data from the operation data obtained by the second data acquisition;
9) Risk output: feed the main features and main feature data obtained from data preprocessing into the trained risk prediction model to obtain a risk value, and input the risk value into the determiner to output the degree of risk.
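The feature selection rule of step 3 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the text does not say how the p value is computed, so a permutation test on the Pearson coefficient is swapped in here; the 0.4 threshold is assumed to apply to all three coefficients; Kendall's coefficient is computed as tau-b; and all function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def ranks(v):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(x, y):
    # Spearman's coefficient is Pearson's coefficient applied to the ranks.
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-b, which corrects for ties (the 0/1 label is heavily tied)."""
    conc = disc = tied_x = tied_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx * dy > 0:
                conc += 1
            elif dx * dy < 0:
                disc += 1
            elif dx == 0 and dy != 0:
                tied_x += 1
            elif dy == 0 and dx != 0:
                tied_y += 1
    denom = math.sqrt((conc + disc + tied_x) * (conc + disc + tied_y))
    return (conc - disc) / denom if denom else 0.0


def permutation_p(x, y, trials=2000, seed=0):
    """Assumed stand-in for the p value: share of label shuffles with |r| >= observed."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)


def select_main_features(table, labels, r_min=0.4, p_max=0.005):
    """table maps a feature name to its numeric column; labels are 0/1 platform states."""
    main = []
    for name, values in table.items():
        coeffs = (pearson(values, labels), spearman(values, labels), kendall(values, labels))
        if all(abs(c) >= r_min for c in coeffs) and permutation_p(values, labels) <= p_max:
            main.append(name)
    return main
```

With 2000 shuffles the smallest attainable p value is 1/2001, which is below the 0.005 threshold; a much smaller number of trials could never pass the test.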
In the first data acquisition and the second data acquisition, the operation data of the P2P online lending platforms are obtained by web crawling or are provided by the platforms themselves.
The pattern feature includes three types: numeric, character, and judgment.
The first predetermined amount is 100 to 150 and the second predetermined amount is 75 to 100.
The second predetermined amount is at least half of the first predetermined amount.
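Grouping features by pattern feature when building the candidate feature table might look like the sketch below. It assumes that the judgment type corresponds to yes/no (boolean) values, which the text does not define, and the function names and feature names in the usage example are hypothetical.

```python
def feature_type(values):
    """Classify a feature column as 'judgment', 'numeric', or 'character'."""
    # bool is checked first because Python treats bool as a subclass of int.
    if all(isinstance(v, bool) for v in values):
        return "judgment"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numeric"
    return "character"


def build_candidate_table(platforms):
    """Group feature names by type across a list of per-platform dicts."""
    table = {"numeric": [], "character": [], "judgment": []}
    names = sorted({name for platform in platforms for name in platform})
    for name in names:
        values = [platform[name] for platform in platforms if name in platform]
        table[feature_type(values)].append(name)
    return table
```

For example, calling `build_candidate_table` on records with an interest rate, a province name, and a custody flag places each feature name in the numeric, character, and judgment group respectively.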
Working principle of this embodiment: first, the operation data of multiple P2P online lending platforms are obtained by web crawling or are provided by the platforms. The platforms are divided into normally operating platforms and absconded platforms, and the features and feature data of each platform are extracted and archived for later use in model training and testing. The features of each platform are placed into the candidate table to form the candidate feature table; a correlation analysis is carried out on each feature in the candidate feature table and the main features are extracted. A risk prediction model based on logistic regression is built and trained with the main features and main feature data of the collected P2P online lending platforms. Once the model is determined, the operation data of the P2P online lending platform to be assessed are obtained by web crawling or are provided by the platform, and are then preprocessed. The main features and main feature data extracted by data preprocessing are fed into the risk prediction model to obtain a risk value, which is input into the determiner to output the degree of risk.
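The training and scoring stages of this workflow can be sketched as follows. This is a minimal pure-Python illustration, assuming the textbook per-sample update for stochastic gradient descent; the learning rate, the number of epochs, and the cut-off values used by the determiner are illustrative assumptions, and every function name is hypothetical.

```python
import math
import random


def sigmoid(z):
    # Written in two branches so that exp() never overflows for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_sgd(samples, alpha=0.1, epochs=200, seed=0):
    """samples: list of (x, y) with x a list of n main-feature values and y in {0, 1}."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    theta = [0.0] * (n + 1)  # the last entry multiplies the constant intercept term
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            xb = x + [1.0]  # append the intercept term X_{n+1} = 1.0
            error = sigmoid(sum(t * v for t, v in zip(theta, xb))) - y
            # Per-sample update: theta_j := theta_j - alpha * (h(x) - y) * x_j
            theta = [t - alpha * error * v for t, v in zip(theta, xb)]
    return theta


def predict_risk(theta, x):
    """Return the model's probability that the platform absconds."""
    return sigmoid(sum(t * v for t, v in zip(theta, x + [1.0])))


def risk_level(p, cutoffs=(0.3, 0.7)):
    """Hypothetical determiner: the cut-off values are illustrative, not from the source."""
    low, high = cutoffs
    if p < low:
        return "low"
    if p < high:
        return "medium"
    return "high"
```

Because the model outputs a probability, the cut-offs in `risk_level` can be moved without retraining, which is the manual cut-off setting the text refers to.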
Obviously, the described embodiments are only some of the embodiments of the present invention rather than all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
Claims (5)
1. A system for P2P platform operational risk assessment based on logistic regression, characterized by comprising the following steps:
1) First data acquisition: obtain the operation data of multiple P2P online lending platforms;
2) Establish the candidate feature table: screen and extract the operation data obtained by the first data acquisition, extracting features and the data corresponding to each feature; the data corresponding to a feature are defined as feature data; classify the features according to the pattern feature of the feature data and establish the candidate feature table;
3) Correlation analysis and feature selection: carry out a correlation analysis on the features using the Pearson correlation coefficient, the Spearman correlation coefficient, the Kendall correlation coefficient and the p value; through these four analysis methods each feature obtains 4 analysis values; select the features whose correlation analysis absolute value is 0.4 or more and whose p value is at the same time 0.005 or less, and define these features as the main features;
4) Build the model training set: take the data of the first predetermined amount of normal platforms and the first predetermined amount of absconded platforms from the first data acquisition and extract the main feature data of these platforms as the training set; organize the training set as $\{(X^{(1)},Y^{(1)}),(X^{(2)},Y^{(2)}),\dots,(X^{(m)},Y^{(m)})\}$, where $X^{(m)}$ is the vector of main feature data and $Y^{(m)}$ is the survival state of the platform, 1 for absconded and 0 for normal;
5) Establish the risk assessment model: establish the basic function of logistic regression, whose mathematical form is: [formula omitted in the source text]. In logistic regression the training set consists of m labeled samples $\{(X^{(1)},Y^{(1)}),(X^{(2)},Y^{(2)}),\dots,(X^{(m)},Y^{(m)})\}$; the input feature is $X^{(m)}$, the dimension of the feature vector $X$ is $n+1$, and $X_{n+1}=1.0$ is the intercept term. Logistic regression handles a binary classification problem with class label $Y^{(m)} \in \{0,1\}$, so the model function is: [formula omitted in the source text]. Using the maximum likelihood method, the likelihood function of logistic regression is: [formula omitted in the source text], where m is the number of samples. After the logarithm is taken, the likelihood function is expressed as: [formula omitted in the source text]. This function is concave (upward convex), so its maximum can be found by gradient ascent; alternatively, multiplying it by -1 gives a convex function whose minimum, the minimal negative log-likelihood, can be found by gradient descent. The loss function of logistic regression is therefore: [formula omitted in the source text]. Stochastic gradient descent is used to solve for the parameter values; the gradient descent update formula for $\theta$ is: [formula omitted in the source text], where the derivation of $J(\theta)$ is: [formula omitted in the source text], so the gradient descent update formula for $\theta$ is: [formula omitted in the source text], where $\alpha$ is the learning rate. As the updates proceed, $J(\theta)$ gradually approaches its minimum and the updates finally stop, which yields the parameters of the model and thereby determines the final model.
7) Second data acquisition: obtain the operation data of the P2P online lending platform to be assessed;
8) Data preprocessing: extract the main features and main feature data from the operation data obtained by the second data acquisition;
9) Risk output: feed the main features and main feature data obtained from data preprocessing into the trained risk prediction model to obtain a risk value, and input the risk value into the determiner to output the degree of risk.
2. The system for P2P platform operational risk assessment based on logistic regression according to claim 1, characterized in that, in the first data acquisition and the second data acquisition, the operation data of the P2P online lending platforms are obtained by web crawling or are provided by the platforms themselves.
3. The system for P2P platform operational risk assessment based on logistic regression according to claim 1, characterized in that the pattern feature includes three types: numeric, character, and judgment.
4. The system for P2P platform operational risk assessment based on logistic regression according to claim 1, characterized in that the first predetermined amount is 100 to 150 and the second predetermined amount is 75 to 100.
5. The system for P2P platform operational risk assessment based on logistic regression according to claim 1, characterized in that the second predetermined amount is at least half of the first predetermined amount.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810301074.7A CN108492049A (en) | 2018-04-04 | 2018-04-04 | A system for P2P platform operational risk assessment based on logistic regression |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810301074.7A CN108492049A (en) | 2018-04-04 | 2018-04-04 | A system for P2P platform operational risk assessment based on logistic regression |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108492049A (en) | 2018-09-04 |
Family
ID=63314864
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810301074.7A Pending CN108492049A (en) | 2018-04-04 | 2018-04-04 | A system for P2P platform operational risk assessment based on logistic regression |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108492049A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111369341A (en) * | 2020-03-05 | 2020-07-03 | 厦门正北科技有限公司 | Intelligent risk scoring system for clients before automobile financial loan |
CN112418738A (en) * | 2020-12-17 | 2021-02-26 | 泸州银行股份有限公司 | Staff operation risk prediction method based on logistic regression |
CN113205219A (en) * | 2021-05-12 | 2021-08-03 | 大连大学 | Agricultural water quality prediction method based on gradient descent optimization logistic regression algorithm |
CN116070725A (en) * | 2022-08-29 | 2023-05-05 | 山东科技大学 | Mining pressure risk prediction method based on logistic regression |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108492049A (en) | A system for P2P platform operational risk assessment based on logistic regression | |
CN106779755A (en) | A kind of network electric business borrows or lends money methods of risk assessment and model | |
CN108734380B (en) | Risk account determination method and device and computing equipment | |
CN109492945A (en) | Business risk identifies monitoring method, device, equipment and storage medium | |
CN110390465A (en) | Air control analysis and processing method, device and the computer equipment of business datum | |
CN107194803A (en) | A kind of P2P nets borrow the device of borrower's assessing credit risks | |
CN108876600A (en) | Warning information method for pushing, device, computer equipment and medium | |
CN106530078A (en) | Loan risk early warning method and system based on multi-industry data | |
CN102163310A (en) | Information pushing method and device based on credit rating of user | |
CN106503873A (en) | A kind of prediction user follows treaty method, device and the computing device of probability | |
CN110111113B (en) | Abnormal transaction node detection method and device | |
CN107784312A (en) | Machine learning model training method and device | |
CN105354210A (en) | Mobile game payment account behavior data processing method and apparatus | |
CN110147823A (en) | A kind of air control model training method, device and equipment | |
CN110415111A (en) | Merge the method for logistic regression credit examination & approval with expert features based on user data | |
CN108009911A (en) | A kind of method of identification P2P network loan borrower's default risks | |
CN108492001A (en) | A method of being used for guaranteed loan network risk management | |
CN107633455A (en) | Credit estimation method and device based on data model | |
CN111476296A (en) | Sample generation method, classification model training method, identification method and corresponding devices | |
CN106484919A (en) | A kind of industrial sustainability sorting technique based on webpage autonomous word and system | |
CN107545038A (en) | A kind of file classification method and equipment | |
CN108711100A (en) | A kind of system of the P2P platform operation risk assessment based on neural network | |
CN112561320A (en) | Training method of mechanism risk prediction model, mechanism risk prediction method and device | |
CN116468300A (en) | Army general hospital discipline assessment method and system based on neural network | |
CN108492050A (en) | A kind of P2P network loan platforms operations risks assessment system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180904 |