CN108492049A - System for P2P platform operation risk assessment based on logistic regression - Google Patents

System for P2P platform operation risk assessment based on logistic regression

Info

Publication number
CN108492049A
Authority
CN
China
Prior art keywords
data
feature
platform
logic
risk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810301074.7A
Other languages
Chinese (zh)
Inventor
冯世程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201810301074.7A
Publication of CN108492049A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The present invention relates to a system for P2P platform operation risk assessment based on logistic regression. A candidate feature table is derived by analyzing and summarizing a large amount of P2P platform data. The features in the table are divided into data-value characteristics and data-existence characteristics, and correlation analysis between these characteristics and the platform's risk index selects the main features used to train the model, which helps ensure the model's accuracy and improves working efficiency. A risk assessment model based on logistic regression is then built. Logistic regression is simple to deploy, its output is a probability, the cutoff can be set manually, training is fast, and retraining is cheap. Training the model with the update formula makes the model more accurate.

Description

System for P2P platform operation risk assessment based on logistic regression
Technical Field
The present invention relates to a risk assessment system, and in particular to a system for P2P platform operation risk assessment based on logistic regression.
Background
P2P is short for the English "person-to-person", meaning individual to individual, and is also called peer-to-peer online lending. It is a private microlending model that pools small sums of money and lends them to people who need funds. It is a type of internet finance product and a form of private small-sum lending, comprising online lending platforms built on internet and mobile internet technology together with the related wealth-management activities and financial services.
There are currently thousands of P2P online lending platforms in China. Analysis of P2P platforms must cover not only the credit risk of borrowers but also the operational risk of the platform itself: data are needed to judge whether a given platform is bad, that is, whether it is at risk of running away. However, different risk assessment models produce different results with different accuracy.
Summary of the Invention
The object of the present invention is to disclose a system for P2P platform operation risk assessment based on logistic regression, providing a method for building a logistic-regression-based risk assessment model and training that model to greater accuracy through a calculation method.
The technical solution that realizes the system for P2P platform operation risk assessment based on logistic regression is as follows:
A system for P2P platform operation risk assessment based on logistic regression, comprising the following steps:
1) First data acquisition: Obtain the operating data of multiple P2P online lending platforms;
2) Building the candidate feature table: Screen the operating data obtained in the first data acquisition and extract features together with the data corresponding to each feature. The data corresponding to a feature are defined as feature data. The features are classified according to the type characteristics of their feature data to build the candidate feature table;
3) Correlation analysis and feature selection: Correlation analysis is performed on the features using the Pearson correlation coefficient, the Spearman correlation coefficient, the Kendall correlation coefficient, and the p-value. These four analyses give each feature four analysis values. Features whose correlation coefficient has an absolute value of 0.4 or more and whose p-value is 0.005 or less are selected and defined as the main features;
4) Building the model training set: From the first data acquisition, take the data of a first predetermined number of normal platforms and a first predetermined number of runaway platforms, and extract the main feature data of these platforms as the training set. The training set is organized as $\{(X^{(1)}, Y^{(1)}), (X^{(2)}, Y^{(2)}), \ldots, (X^{(m)}, Y^{(m)})\}$, where $X^{(m)}$ is the vector of main feature data and $Y^{(m)}$ is the current state of the platform: 1 for runaway, 0 for normal;
5) Building the risk assessment model: The basic function of logistic regression is the sigmoid function, whose mathematical form is $g(z) = \frac{1}{1 + e^{-z}}$. In logistic regression, the training set consists of m labeled samples $\{(X^{(1)}, Y^{(1)}), (X^{(2)}, Y^{(2)}), \ldots, (X^{(m)}, Y^{(m)})\}$. The input feature is $X^{(m)}$; the feature vector X has dimension n+1, where $X_{n+1} = 1.0$ is the intercept term. Logistic regression handles binary classification, with class label $Y^{(m)} \in \{0, 1\}$, so the model function is $h_\theta(X) = g(\theta^T X) = \frac{1}{1 + e^{-\theta^T X}}$. Using the maximum likelihood method, the logistic regression likelihood function is $L(\theta) = \prod_{i=1}^{m} h_\theta(X^{(i)})^{Y^{(i)}} \left(1 - h_\theta(X^{(i)})\right)^{1 - Y^{(i)}}$, where m is the number of samples. Taking the logarithm gives $\ell(\theta) = \sum_{i=1}^{m} \left[ Y^{(i)} \log h_\theta(X^{(i)}) + (1 - Y^{(i)}) \log\left(1 - h_\theta(X^{(i)})\right) \right]$. This function is concave, so its maximum can be found by gradient ascent; alternatively, multiplying it by -1 yields a convex function whose minimum, the minimum negative log-likelihood, can be found by gradient descent. The loss function of logistic regression is therefore $J(\theta) = -\ell(\theta)$. Stochastic gradient descent is used to solve for the parameter values, with the gradient descent update for θ given by $\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j}$, where the derivative of J(θ) is derived as follows:
$\frac{\partial J(\theta)}{\partial \theta_j} = -\sum_{i=1}^{m} \left[ \frac{Y^{(i)}}{h_\theta(X^{(i)})} - \frac{1 - Y^{(i)}}{1 - h_\theta(X^{(i)})} \right] h_\theta(X^{(i)}) \left(1 - h_\theta(X^{(i)})\right) X_j^{(i)} = \sum_{i=1}^{m} \left( h_\theta(X^{(i)}) - Y^{(i)} \right) X_j^{(i)}$, so the gradient descent update for θ is $\theta_j := \theta_j - \alpha \sum_{i=1}^{m} \left( h_\theta(X^{(i)}) - Y^{(i)} \right) X_j^{(i)}$, where α is the learning rate. As the updates proceed, J(θ) gradually approaches its minimum and updating finally stops, which yields the model parameters and thus determines the final model.
7) Second data acquisition: Obtain the operating data of the P2P online lending platform to be assessed;
8) Data preprocessing: Extract the main features and main feature data from the operating data obtained in the second data acquisition;
9) Risk output: The main features and main feature data obtained from data preprocessing are fed into the trained risk prediction model to obtain a risk value, and the risk value is input to a determiner, which outputs the degree of risk.
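The training procedure in step 5 can be illustrated with a short sketch. This is not the patent's implementation: the function names, the learning rate, the number of passes, and the six toy samples are all assumptions chosen for illustration, and the update is applied one sample at a time, which is the stochastic form of the update rule described in step 5.

```python
import math
import random


def sigmoid(z):
    """Logistic function 1 / (1 + e^-z), split by sign to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def risk_value(theta, x):
    """Model output h_theta(x); a constant 1.0 is appended as the intercept input."""
    extended = list(x) + [1.0]
    return sigmoid(sum(t * v for t, v in zip(theta, extended)))


def train_sgd(samples, labels, alpha=0.1, epochs=500, seed=0):
    """Fit theta by stochastic gradient descent on the negative log-likelihood."""
    rng = random.Random(seed)
    theta = [0.0] * (len(samples[0]) + 1)   # n feature weights plus one intercept
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)                  # visit the samples in random order
        for i in order:
            extended = list(samples[i]) + [1.0]
            error = risk_value(theta, samples[i]) - labels[i]   # h_theta(x) - y
            # theta_j := theta_j - alpha * (h_theta(x) - y) * x_j
            theta = [t - alpha * error * v for t, v in zip(theta, extended)]
    return theta


# Hypothetical, already-scaled main feature data: 1 = runaway, 0 = normal.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95], [0.2, 0.1], [0.1, 0.3], [0.15, 0.2]]
train_y = [1, 1, 1, 0, 0, 0]
theta = train_sgd(train_x, train_y)
for x, y in zip(train_x, train_y):
    print(y, round(risk_value(theta, x), 3))
```

Because the output is a probability, retraining on new data only requires calling `train_sgd` again, and the cutoff applied to `risk_value` can be chosen separately.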
Further, in the first data acquisition and the second data acquisition, the operating data of the P2P online lending platforms are obtained by web crawling or are provided by the platforms themselves.
Further, the type characteristics include three kinds: numeric type, character type, and judgment type.
Further, the first predetermined number is 100 to 150, and the second predetermined number is 75 to 100.
Further, the second predetermined number is at least half of the first predetermined number.
The beneficial effects of the present invention are as follows: A candidate feature table is derived by analyzing and summarizing a large amount of P2P platform data. The features in the table are divided into data-value characteristics and data-existence characteristics, and correlation analysis between these characteristics and the platform's risk index selects the main features used to train the model, which helps ensure the model's accuracy and improves working efficiency. A risk assessment model based on logistic regression is then built. Logistic regression is simple to deploy, its output is a probability, the cutoff can be set manually, training is fast, and retraining is cheap. Training the model with the update formula makes the model more accurate.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment: A system for P2P platform operation risk assessment based on logistic regression, comprising the following steps:
1) First data acquisition: Obtain the operating data of multiple P2P online lending platforms;
2) Building the candidate feature table: Screen the operating data obtained in the first data acquisition and extract features together with the data corresponding to each feature. The data corresponding to a feature are defined as feature data. The features are classified according to the type characteristics of their feature data to build the candidate feature table;
3) Correlation analysis and feature selection: Correlation analysis is performed on the features using the Pearson correlation coefficient, the Spearman correlation coefficient, the Kendall correlation coefficient, and the p-value. These four analyses give each feature four analysis values. Features whose correlation coefficient has an absolute value of 0.4 or more and whose p-value is 0.005 or less are selected and defined as the main features. The p-value governs how reliable a measured correlation is: by definition, p = 0.05 means that, if the variables were actually unrelated, an association at least as strong as the one observed would arise by chance in 5% of samples, and p = 0.005 means it would arise in 0.5% of samples. The choice of p-value threshold determines the number of main features; repeated simulation indicates that p = 0.005 is the most reasonable choice;
4) Building the model training set: From the first data acquisition, take the data of a first predetermined number of normal platforms and a first predetermined number of runaway platforms, and extract the main feature data of these platforms as the training set. The training set is organized as $\{(X^{(1)}, Y^{(1)}), (X^{(2)}, Y^{(2)}), \ldots, (X^{(m)}, Y^{(m)})\}$, where $X^{(m)}$ is the vector of main feature data and $Y^{(m)}$ is the current state of the platform: 1 for runaway, 0 for normal;
5) Building the risk assessment model: The basic function of logistic regression is the sigmoid function, whose mathematical form is $g(z) = \frac{1}{1 + e^{-z}}$. In logistic regression, the training set consists of m labeled samples $\{(X^{(1)}, Y^{(1)}), (X^{(2)}, Y^{(2)}), \ldots, (X^{(m)}, Y^{(m)})\}$. The input feature is $X^{(m)}$; the feature vector X has dimension n+1, where $X_{n+1} = 1.0$ is the intercept term. Logistic regression handles binary classification, with class label $Y^{(m)} \in \{0, 1\}$, so the model function is $h_\theta(X) = g(\theta^T X) = \frac{1}{1 + e^{-\theta^T X}}$. Using the maximum likelihood method, the logistic regression likelihood function is $L(\theta) = \prod_{i=1}^{m} h_\theta(X^{(i)})^{Y^{(i)}} \left(1 - h_\theta(X^{(i)})\right)^{1 - Y^{(i)}}$, where m is the number of samples. Taking the logarithm gives $\ell(\theta) = \sum_{i=1}^{m} \left[ Y^{(i)} \log h_\theta(X^{(i)}) + (1 - Y^{(i)}) \log\left(1 - h_\theta(X^{(i)})\right) \right]$. This function is concave, so its maximum can be found by gradient ascent; alternatively, multiplying it by -1 yields a convex function whose minimum, the minimum negative log-likelihood, can be found by gradient descent. The loss function of logistic regression is therefore $J(\theta) = -\ell(\theta)$.
Stochastic gradient descent is used to solve for the parameter values, with the gradient descent update for θ given by $\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j}$, where the derivative of J(θ) is derived as follows:
$\frac{\partial J(\theta)}{\partial \theta_j} = -\sum_{i=1}^{m} \left[ \frac{Y^{(i)}}{h_\theta(X^{(i)})} - \frac{1 - Y^{(i)}}{1 - h_\theta(X^{(i)})} \right] h_\theta(X^{(i)}) \left(1 - h_\theta(X^{(i)})\right) X_j^{(i)} = \sum_{i=1}^{m} \left( h_\theta(X^{(i)}) - Y^{(i)} \right) X_j^{(i)}$, so the gradient descent update for θ is $\theta_j := \theta_j - \alpha \sum_{i=1}^{m} \left( h_\theta(X^{(i)}) - Y^{(i)} \right) X_j^{(i)}$, where α is the learning rate. As the updates proceed, J(θ) gradually approaches its minimum and updating finally stops, which yields the model parameters and thus determines the final model.
7) Second data acquisition: Obtain the operating data of the P2P online lending platform to be assessed;
8) Data preprocessing: Extract the main features and main feature data from the operating data obtained in the second data acquisition;
9) Risk output: The main features and main feature data obtained from data preprocessing are fed into the trained risk prediction model to obtain a risk value, and the risk value is input to a determiner, which outputs the degree of risk.
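The screening rule in step 3 can be sketched in a few lines of standard-library Python. Several points here are assumptions rather than anything the patent specifies: the feature names and values are invented, a feature is kept only if all three coefficients reach the 0.4 threshold, and the p-value is taken from an exact permutation test on the Pearson coefficient for a binary label. A production system would more likely call a statistics library.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-b, which corrects for ties in either variable."""
    conc = disc = tie_x = tie_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue
        if dx == 0:
            tie_x += 1
        elif dy == 0:
            tie_y += 1
        elif dx * dy > 0:
            conc += 1
        else:
            disc += 1
    denom = math.sqrt((conc + disc + tie_x) * (conc + disc + tie_y))
    return (conc - disc) / denom if denom else 0.0


def permutation_p_value(x, labels):
    """Exact two-sided permutation p-value of |Pearson r| for 0/1 labels."""
    n, k = len(labels), sum(labels)
    observed = abs(pearson(x, labels))
    hits = total = 0
    for ones in combinations(range(n), k):   # every way to relabel k platforms as 1
        chosen = set(ones)
        relabeled = [1 if i in chosen else 0 for i in range(n)]
        total += 1
        if abs(pearson(x, relabeled)) >= observed - 1e-12:
            hits += 1
    return hits / total


def screen_features(table, labels, min_corr=0.4, max_p=0.005):
    """Keep features whose three coefficients and p-value pass the thresholds."""
    selected = []
    for name, values in table.items():
        coefs = [f(values, labels) for f in (pearson, spearman, kendall)]
        if all(abs(c) >= min_corr for c in coefs) and \
                permutation_p_value(values, labels) <= max_p:
            selected.append(name)
    return selected


# Hypothetical data for 12 platforms: 1 = runaway, 0 = normal.
labels = [1] * 6 + [0] * 6
candidates = {
    "annual_yield": [18, 20, 22, 19, 24, 21, 9, 10, 8, 11, 12, 10],
    "registered_capital": [5, 1, 9, 3, 7, 2, 4, 8, 2, 6, 1, 9],
}
main_features = screen_features(candidates, labels)
print(main_features)
```

The exact permutation test enumerates every relabeling, so it is only practical for small samples; it is used here to keep the sketch deterministic and dependency-free.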
In the first data acquisition and the second data acquisition, the operating data of the P2P online lending platforms are obtained by web crawling or are provided by the platforms themselves.
The type characteristics include three kinds: numeric type, character type, and judgment type.
The first predetermined number is 100 to 150, and the second predetermined number is 75 to 100.
The second predetermined number is at least half of the first predetermined number.
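As an illustration of step 2 and the three value types named above, the following sketch builds a small candidate feature table from raw platform records. The record fields, the function names, and the `coverage` column (the share of platforms that report a value, used here as a stand-in for a data-existence characteristic) are assumptions made for the example, not details given in the patent.

```python
def infer_feature_type(values):
    """Classify one feature column as 'judgment', 'numeric' or 'character'."""
    present = [v for v in values if v is not None]
    # bool is checked first because Python treats bool as a subclass of int.
    if present and all(isinstance(v, bool) for v in present):
        return "judgment"
    if present and all(isinstance(v, (int, float)) and not isinstance(v, bool)
                       for v in present):
        return "numeric"
    return "character"


def build_candidate_table(records):
    """Map each feature name to its inferred type and its coverage across platforms."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    table = {}
    for name, values in columns.items():
        present = sum(v is not None for v in values)
        table[name] = {
            "type": infer_feature_type(values),
            "coverage": present / len(records),   # share of platforms with a value
        }
    return table


# Hypothetical operating data for three platforms.
records = [
    {"annual_yield": 18.0, "custody_bank": "Bank A", "has_icp_license": False},
    {"annual_yield": 9.5, "custody_bank": None, "has_icp_license": True},
    {"annual_yield": 12.0, "has_icp_license": True},
]
candidate_table = build_candidate_table(records)
for name, row in candidate_table.items():
    print(name, row["type"], round(row["coverage"], 2))
```

Keeping the type next to each feature lets later steps decide how to encode it, for example mapping judgment-type values to 0 and 1 before correlation analysis.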
Operating principle of this embodiment: First, the operating data of multiple P2P online lending platforms are obtained by web crawling or are provided by the platforms themselves. The platforms are divided into normally operating platforms and runaway platforms, and the features and feature data of each platform are extracted and archived for later use in model training and testing. The features of each platform are entered into the candidate feature table, correlation analysis is performed on each feature in the table, and the main features are extracted. A risk prediction model based on logistic regression is built and trained on the main features and main feature data of the collected platforms. Once the model is determined, the operating data of the P2P online lending platform to be assessed are obtained by web crawling or are provided by the platform, and are then preprocessed. The main features and main feature data extracted by preprocessing are fed into the risk prediction model to obtain a risk value, and the risk value is input to a determiner, which outputs the degree of risk.
The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the present invention.
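The final stage of this workflow, scoring a platform and passing the risk value to the determiner, might look like the sketch below. The parameter vector, the feature values, the three risk levels, and the cutoffs 0.7 and 0.4 are hypothetical; the patent only states that the cutoff can be set manually.

```python
import math


def risk_value(theta, features):
    """Probability from the trained model; the last entry of theta is the intercept."""
    z = theta[-1] + sum(t * x for t, x in zip(theta, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)              # avoids overflow for large negative z
    return ez / (1.0 + ez)


def risk_degree(value, cutoffs=((0.7, "high"), (0.4, "medium"))):
    """Determiner: map a risk value to a level using manually chosen cutoffs."""
    for threshold, level in cutoffs:
        if value >= threshold:
            return level
    return "low"


# Hypothetical trained parameters: two feature weights followed by the intercept.
theta = [2.0, 1.5, -2.0]
for platform in ([0.9, 0.8], [0.5, 0.5], [0.1, 0.2]):
    value = risk_value(theta, platform)
    print(platform, round(value, 3), risk_degree(value))
```

Keeping the cutoffs outside the model means the risk levels can be tightened or relaxed without retraining.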

Claims (5)

1. A system for P2P platform operation risk assessment based on logistic regression, characterized by comprising the following steps:
1) First data acquisition: Obtain the operating data of multiple P2P online lending platforms;
2) Building the candidate feature table: Screen the operating data obtained in the first data acquisition and extract features together with the data corresponding to each feature. The data corresponding to a feature are defined as feature data. The features are classified according to the type characteristics of their feature data to build the candidate feature table;
3) Correlation analysis and feature selection: Correlation analysis is performed on the features using the Pearson correlation coefficient, the Spearman correlation coefficient, the Kendall correlation coefficient, and the p-value. These four analyses give each feature four analysis values. Features whose correlation coefficient has an absolute value of 0.4 or more and whose p-value is 0.005 or less are selected and defined as the main features;
4) Building the model training set: From the first data acquisition, take the data of a first predetermined number of normal platforms and a first predetermined number of runaway platforms, and extract the main feature data of these platforms as the training set. The training set is organized as $\{(X^{(1)}, Y^{(1)}), (X^{(2)}, Y^{(2)}), \ldots, (X^{(m)}, Y^{(m)})\}$, where $X^{(m)}$ is the vector of main feature data and $Y^{(m)}$ is the current state of the platform: 1 for runaway, 0 for normal;
5) Building the risk assessment model: The basic function of logistic regression is the sigmoid function, whose mathematical form is $g(z) = \frac{1}{1 + e^{-z}}$. In logistic regression, the training set consists of m labeled samples $\{(X^{(1)}, Y^{(1)}), (X^{(2)}, Y^{(2)}), \ldots, (X^{(m)}, Y^{(m)})\}$. The input feature is $X^{(m)}$; the feature vector X has dimension n+1, where $X_{n+1} = 1.0$ is the intercept term. Logistic regression handles binary classification, with class label $Y^{(m)} \in \{0, 1\}$, so the model function is $h_\theta(X) = g(\theta^T X) = \frac{1}{1 + e^{-\theta^T X}}$. Using the maximum likelihood method, the logistic regression likelihood function is $L(\theta) = \prod_{i=1}^{m} h_\theta(X^{(i)})^{Y^{(i)}} \left(1 - h_\theta(X^{(i)})\right)^{1 - Y^{(i)}}$, where m is the number of samples. Taking the logarithm gives $\ell(\theta) = \sum_{i=1}^{m} \left[ Y^{(i)} \log h_\theta(X^{(i)}) + (1 - Y^{(i)}) \log\left(1 - h_\theta(X^{(i)})\right) \right]$. This function is concave, so its maximum can be found by gradient ascent; alternatively, multiplying it by -1 yields a convex function whose minimum, the minimum negative log-likelihood, can be found by gradient descent. The loss function of logistic regression is therefore $J(\theta) = -\ell(\theta)$. Stochastic gradient descent is used to solve for the parameter values, with the gradient descent update for θ given by $\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j}$,
where the derivative of J(θ) is derived as follows:
$\frac{\partial J(\theta)}{\partial \theta_j} = -\sum_{i=1}^{m} \left[ \frac{Y^{(i)}}{h_\theta(X^{(i)})} - \frac{1 - Y^{(i)}}{1 - h_\theta(X^{(i)})} \right] h_\theta(X^{(i)}) \left(1 - h_\theta(X^{(i)})\right) X_j^{(i)} = \sum_{i=1}^{m} \left( h_\theta(X^{(i)}) - Y^{(i)} \right) X_j^{(i)}$,
so the gradient descent update for θ is $\theta_j := \theta_j - \alpha \sum_{i=1}^{m} \left( h_\theta(X^{(i)}) - Y^{(i)} \right) X_j^{(i)}$, where α is the learning rate. As the updates proceed, J(θ) gradually approaches its minimum and updating finally stops, which yields the model parameters and thus determines the final model.
7) Second data acquisition: Obtain the operating data of the P2P online lending platform to be assessed;
8) Data preprocessing: Extract the main features and main feature data from the operating data obtained in the second data acquisition;
9) Risk output: The main features and main feature data obtained from data preprocessing are fed into the trained risk prediction model to obtain a risk value, and the risk value is input to a determiner, which outputs the degree of risk.
2. The system for P2P platform operation risk assessment based on logistic regression according to claim 1, characterized in that, in the first data acquisition and the second data acquisition, the operating data of the P2P online lending platforms are obtained by web crawling or are provided by the platforms themselves.
3. The system for P2P platform operation risk assessment based on logistic regression according to claim 1, characterized in that the type characteristics include three kinds: numeric type, character type, and judgment type.
4. The system for P2P platform operation risk assessment based on logistic regression according to claim 1, characterized in that the first predetermined number is 100 to 150 and the second predetermined number is 75 to 100.
5. The system for P2P platform operation risk assessment based on logistic regression according to claim 1, characterized in that the second predetermined number is at least half of the first predetermined number.
CN201810301074.7A 2018-04-04 2018-04-04 System for P2P platform operation risk assessment based on logistic regression Pending CN108492049A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810301074.7A CN108492049A (en) 2018-04-04 2018-04-04 System for P2P platform operation risk assessment based on logistic regression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810301074.7A CN108492049A (en) 2018-04-04 2018-04-04 System for P2P platform operation risk assessment based on logistic regression

Publications (1)

Publication Number Publication Date
CN108492049A (en) 2018-09-04

Family

ID=63314864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810301074.7A System for P2P platform operation risk assessment based on logistic regression 2018-04-04 2018-04-04 Pending CN108492049A (en)

Country Status (1)

Country Link
CN (1) CN108492049A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111369341A (en) * 2020-03-05 2020-07-03 厦门正北科技有限公司 Intelligent risk scoring system for clients before automobile financial loan
CN112418738A (en) * 2020-12-17 2021-02-26 泸州银行股份有限公司 Staff operation risk prediction method based on logistic regression
CN113205219A (en) * 2021-05-12 2021-08-03 大连大学 Agricultural water quality prediction method based on gradient descent optimization logistic regression algorithm
CN116070725A (en) * 2022-08-29 2023-05-05 山东科技大学 Mining pressure risk prediction method based on logistic regression

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180904