CN106408184A - User credit evaluation model based on multi-source heterogeneous data - Google Patents

User credit evaluation model based on multi-source heterogeneous data

Info

Publication number
CN106408184A
Authority
CN
China
Prior art keywords
user
feature
source heterogeneous
heterogeneous data
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610817430.1A
Other languages
Chinese (zh)
Inventor
郑子彬
杨亚涛
黄春振
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University
Priority to CN201610817430.1A
Publication of CN106408184A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a user credit evaluation model based on multi-source heterogeneous data, comprising the following steps: (1) acquiring and merging multi-source heterogeneous data; (2) processing user features; and (3) training the model. In the feature expansion and selection stage of the proposed framework, the user's data dimensions are first expanded and useful features are then selected, which reduces the feature dimension and the time complexity of the model. Missing and abnormal data are handled during feature processing, which improves the robustness of the model to missing values.

Description

A user credit evaluation model based on multi-source heterogeneous data
Technical field
The present invention relates to the field of credit evaluation, and in particular to a user credit evaluation model based on multi-source heterogeneous data.
Background art
User credit evaluation means that a credit evaluation agency uses expert judgment or mathematical analysis to comprehensively evaluate the ability of individuals and enterprises to honor their various commitments and their degree of creditworthiness, and expresses the result in simple, clear symbols or words to meet the needs of society and the market. Credit evaluation is already widely used in the financial field. Traditional financial institutions assess credit from the financial records and behavior records that the user holds at that institution. With the development of big data, the limited data used by traditional credit evaluation faces replacement.
With the deepening development of the Internet, users' behavior is recorded on the network every day. These data reflect users' real behavior and are naturally also important data for user credit evaluation. Using a user's multi-source heterogeneous data for credit evaluation has become a new trend. We propose to make in-depth use of data along the following dimensions for user credit evaluation:
1) Basic information: basic demographic information such as the user's age, native place, and current work address;
2) Network behavior information: the web pages the user browses, the tools used to browse them, and the distribution and duration of browsing;
3) Academic record and education information: the user's education information;
4) Social network information: the user's behavior and social information on public social networks such as Weibo and Zhihu;
5) Third-party payment information: the user's consumption records on third-party payment platforms;
6) Online questionnaire information: credit-related information and basic information collected through questionnaires.
The basic data of the six dimensions above all come from the Internet, which clearly distinguishes them from traditional credit evaluation data. The data of an Internet user can reach thousands of dimensions, and the data come from different sources, so a user's credit can be evaluated from every angle; data with more dimensions can describe a user's credit standing more comprehensively.
However, as the data dimension rises from a few dozen to thousands, it also brings challenges to model construction. These challenges can be summarized as follows:
1. High dimensionality of the data. Traditional credit evaluation models are built on features of a few dozen dimensions, so model training is short and data dimensionality has received little attention. Evaluating users from Internet information, however, considers not only transaction-related information but also dimensions such as social networks and behavioral preferences, so the data can reach thousands of dimensions. Data of such high dimension needs a good feature selection method that reduces the feature dimension without reducing the evaluation performance of the model, improving both the training speed and the practical effectiveness of the model;
2. Missing values and outliers. Because many user dimensions are considered, a user cannot have a value in every dimension, and in many cases a user has many missing values. In addition, because some data are obtained implicitly, they cannot be guaranteed to be entirely correct during collection or transmission, so the data also contain some outliers. Current models seldom propose specific, detailed solutions to this problem, yet the handling of missing values and outliers is particularly important for improving the evaluation performance of the model.
Summary of the invention
To solve the above problems, the present invention proposes a user credit evaluation model based on multi-source heterogeneous data, comprising the following steps:
(1) acquisition and merging of multi-source heterogeneous data;
(2) processing of user features;
(3) training of the model.
Further, the acquisition of the multi-source heterogeneous data includes:
using crawler technology to crawl user-related information from web pages;
data provided by the user: to obtain a credit report, the user must first provide appropriate basic personal information;
access to data of third-party institutions authorized by the user.
Further, the merging of the multi-source heterogeneous data includes:
matching user-authorized information and user-provided data on any one of email address, mobile phone number, or identity card number;
merging information crawled from the web by email address, user name, and user authorization.
Further, the processing of user features includes missing-value and abnormal-feature processing, discrete encoding of categorical features, in-depth mining of time-series features, and obtaining statistical features.
Further, the training of the model includes linear model training and decision tree model training.
Further, the multi-source heterogeneous data includes the user's basic information, academic information, payment information, social network information, operation information, and network behavior information.
Further, the missing-value and abnormal-feature processing specifically includes:
a. features with a missing rate below 20% are filled: numeric features are filled with the mean, and categorical features are filled with the mode;
b. features with a missing rate above 97% undergo discard processing and discrete-encoding conversion: discard processing removes features whose missing share exceeds 97%, and where the missing rate is high, these features are discretely encoded;
c. missing-value statistics matrix: in the user feature matrix, missing entries are set to 1 and non-missing entries are set to 0.
Further, the discrete encoding of categorical features specifically includes: a feature with N possible values is encoded as N binary features that are mutually exclusive, with only one active at a time, which makes the data sparse.
Further, the in-depth mining of time-series features specifically includes:
1. subtracting adjacent periods, giving the difference between periods or intervals;
2. dividing adjacent periods, giving the period-over-period ratio (slope) between periods or intervals;
3. accumulating across periods, giving the change in the running total.
Further, obtaining statistical features specifically includes: computing the missing rate of the user's information, whether the user has large-amount transaction records, statistics on the user's active time, and the rate of change of the user's location; the statistical methods include global statistics or binned statistics.
Further, the linear model training includes LASSO, Liblinear, and Linear-SVM; the decision tree model training includes Boosting and XGBoost.
The beneficial effects of the invention are as follows: in the feature expansion and selection stage of the proposed model framework, the user's data dimensions are first expanded and useful features are then selected, which reduces the feature dimension and the time complexity of the model; at the same time, missing and abnormal data are handled during feature processing, which gives the model robustness to missing values.
Brief description of the drawings
Fig. 1 is a diagram of user credit evaluation based on multi-source heterogeneous data;
Fig. 2 is a diagram of the missing-value statistics matrix;
Fig. 3 is a diagram of how address-type features are constructed;
Fig. 4 is a mapping diagram for the division operation.
Detailed description of the embodiments
The present invention is described in detail below.
As shown in Fig. 1, user credit evaluation based on multi-source heterogeneous data comprises three major steps:
(1) acquisition and merging of multi-source heterogeneous data;
(2) processing of user features;
(3) training of the model.
Wherein:
(1) Data basis layer
The data basis layer includes the user's basic information, academic information, payment information, social network information, operation information, network behavior information, and so on in the network environment. This information comes from different data sources and can effectively express every aspect of the user, which is also key to enabling the model to grasp the user's credit status more accurately. The information is linked through any one of user ID, identity card number, email address, and mobile phone number. Linking the multi-source data to the user prepares the data for the next step, the multi-dimensional evaluation of the user's credit.
Specifically, the multi-source heterogeneous data is acquired through:
1) crawler technology, which crawls user-related information from web pages;
2) data provided by the user: to obtain a credit report, the user must first provide appropriate basic personal information;
3) access to data of third-party institutions authorized by the user.
The multi-source heterogeneous data is merged as follows:
1) user-authorized information and user-provided data are matched on any one of email address, mobile phone number, or identity card number;
2) information crawled from the web is merged by email address, user name, and IP (with user authorization).
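The matching step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the record layout, the field names `email`, `phone`, and `id_card`, and the helper `merge_sources` are all hypothetical and introduced here only for the example.

```python
from typing import Dict, List, Optional

# Hypothetical key fields; the text names email address, phone number and ID number.
KEY_FIELDS = ("email", "phone", "id_card")


def merge_sources(sources: List[List[dict]]) -> List[dict]:
    """Merge records from several sources when they share any one key field value."""
    merged: List[dict] = []
    index: Dict[tuple, int] = {}  # (field, value) -> position in `merged`

    for records in sources:
        for rec in records:
            # Look for an already-known user that matches on any one key field.
            pos: Optional[int] = None
            for field in KEY_FIELDS:
                value = rec.get(field)
                if value is not None and (field, value) in index:
                    pos = index[(field, value)]
                    break
            if pos is None:
                merged.append({})
                pos = len(merged) - 1
            # Keep the first value seen for each attribute.
            for name, value in rec.items():
                merged[pos].setdefault(name, value)
            # Register every key of the merged record for later matches.
            for field in KEY_FIELDS:
                value = merged[pos].get(field)
                if value is not None:
                    index[(field, value)] = pos
    return merged


authorized = [{"email": "a@example.com", "phone": "13800000000", "deposit": 50000}]
provided = [{"phone": "13800000000", "age": 30}, {"id_card": "X1", "age": 41}]
users = merge_sources([authorized, provided])
```

Here the first user-provided record joins the authorized record through the shared phone number, while the second has no key in common and stays a separate user. The sketch does not merge two users that were created separately and later turn out to share a key; a production system would need that step.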
(2) Data analysis layer
The data analysis layer includes several ways of processing the data. In summary, it finds ordered, structured features in messy, unordered data so that the user's information is expressed more precisely. The work of this layer includes:
1. Missing-value and abnormal-feature processing
Merging multi-source data inevitably produces a large amount of missing data. There are many causes: for example, the user has no payment record at a certain bank, the user's basic information was not collected, or the user simply did not fill in some information. Different degrees of missingness call for different forms of data preprocessing.
Missing values appear in forms such as "-1" in numeric features, or the empty string or "NULL" in categorical features. Features are processed according to their missing rate:
A. Features with a missing rate below 0.2 are filled: numeric features are filled with the mean, and categorical features are filled with the mode. This filling threshold and filling method gave the best results in testing;
B. Features with a missing rate above 97% undergo discard processing and discrete-encoding conversion. Discard processing removes features whose missing share exceeds 97%. When the missing rate is high, a feature tends toward being discrete, so we also apply discrete encoding to these features;
C. Missing-value statistics matrix: as shown in Fig. 2, in the user feature matrix, missing entries are set to 1 and non-missing entries are set to 0. We construct these features because we consider missingness itself to be a kind of information.
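A minimal sketch of rules A to C, assuming features are stored as plain Python lists keyed by column name. The function name `process_missing` and the sample columns are hypothetical. The text does not say what happens to features whose missing rate lies between the two thresholds, so the sketch leaves them unchanged, and it covers only the discard part of rule B.

```python
from collections import Counter
from statistics import mean

# Missing markers named in the text ("-1", empty string, "NULL"), plus None.
MISSING = {-1, "", "NULL", None}


def process_missing(columns, fill_below=0.2, drop_above=0.97):
    """Return (processed columns, 0/1 missing-indicator columns)."""
    processed, indicator = {}, {}
    for name, values in columns.items():
        mask = [1 if v in MISSING else 0 for v in values]
        indicator[name] = mask            # rule C: 1 = missing, 0 = present
        rate = sum(mask) / len(values)
        if rate > drop_above:
            continue                      # rule B: discard near-empty features
        if rate < fill_below:             # rule A: fill sparse gaps
            present = [v for v, m in zip(values, mask) if not m]
            if all(isinstance(v, (int, float)) for v in present):
                fill = mean(present)      # numeric feature: mean
            else:
                fill = Counter(present).most_common(1)[0][0]  # categorical: mode
            values = [fill if m else v for v, m in zip(values, mask)]
        processed[name] = values          # assumption: mid-range features kept as-is
    return processed, indicator


columns = {
    "income": [100, -1, 300, 200, 400, 500],
    "city": ["GZ", "GZ", "", "SZ", "GZ", "GZ"],
    "visits": [1, -1, -1, 2, 3, 4],
    "rare": [-1, -1, -1, -1, -1, -1],
}
processed, indicator = process_missing(columns)
```

The indicator columns are kept even for discarded features, which matches the idea that missingness is itself informative.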
2. Discrete encoding of categorical features
The main operation of discrete encoding is to take a feature with N possible values and encode it as N binary features that are mutually exclusive, with only one active at a time, which makes the data sparse. The benefit of this encoding is that it strengthens a tree model's ability to distinguish features, and it also has the effect of augmenting the features. During feature construction, we apply discrete encoding to categorical features (excluding addresses) and to numeric features that take fewer than 12 distinct values. Addresses are excluded because encoding them directly would produce too many feature dimensions, increasing model complexity without a corresponding improvement. Address features are instead converted in the finer-grained way shown in Fig. 3.
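The encoding described here is what is commonly called one-hot encoding. A minimal sketch follows; the function name `one_hot`, the `is_` column prefix, and the sample values are hypothetical, and the `max_levels` parameter mirrors the "fewer than 12 values" rule in the text.

```python
def one_hot(values, max_levels=12):
    """Encode a column as N mutually exclusive 0/1 columns, one per distinct value.

    Returns None when the column has `max_levels` or more distinct values,
    in which case the column is left for other processing.
    """
    levels = sorted(set(values), key=str)
    if len(levels) >= max_levels:
        return None
    return {f"is_{lvl}": [1 if v == lvl else 0 for v in values] for lvl in levels}


encoded = one_hot(["student", "engineer", "student", "teacher"])
# encoded["is_student"] is [1, 0, 1, 0]; every row has exactly one active column.
```

Because exactly one of the N columns is 1 in each row, most entries are 0, which is the sparsity the text refers to.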
3. In-depth mining of time-series features
Much of the collected data is clearly time-related. For example, a person's payment records contain payment information from different periods, and behavior records differ from period to period. The trend features of these time series can effectively capture a person's credit trend. We therefore process the features of different periods in more detail:
1. Subtract adjacent periods, giving the difference between periods or intervals;
2. Divide adjacent periods, giving the period-over-period ratio (slope) between periods or intervals;
3. Accumulate across periods, giving the change in the running total.
For the division operation, some columns contain missing values (uniformly represented as -1), so the column cannot be divided directly. Cases that cannot be divided directly are handled in the manner shown in Fig. 4.
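The three transforms can be sketched as below. Fig. 4 is not reproduced in this text, so the exact rule for division with missing values is unknown; the sketch assumes that a result is treated as missing whenever either operand is missing or the divisor is zero. Derived values use `None` rather than -1 because a difference of -1 is a legitimate value. All function names are hypothetical.

```python
MISSING = -1  # the text states that missing values are uniformly written as -1


def adjacent_diff(series):
    """Difference between adjacent periods; None if either side is missing."""
    return [None if MISSING in (a, b) else b - a
            for a, b in zip(series, series[1:])]


def adjacent_ratio(series):
    """Period-over-period ratio; None if a side is missing or the base is zero."""
    return [None if MISSING in (a, b) or a == 0 else b / a
            for a, b in zip(series, series[1:])]


def running_sum(series):
    """Cumulative total over periods, skipping missing entries."""
    total, out = 0, []
    for v in series:
        if v != MISSING:
            total += v
        out.append(total)
    return out


payments = [100, 150, -1, 300]       # one value per period, third period missing
diffs = adjacent_diff(payments)      # [50, None, None]
ratios = adjacent_ratio(payments)    # [1.5, None, None]
totals = running_sum(payments)       # [100, 250, 250, 550]
```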
4. Statistical features
Statistical features can effectively capture global information. For example, suppose a person's bank deposit is 50,000. If deposits across the whole sample are only a few thousand, this person counts as relatively rich; if deposits across the whole sample are all around 100,000, this person counts as relatively poor. Without global statistics, such information is hard to capture, so statistical features are also an important indicator for user evaluation.
In addition to the feature construction above, we propose several statistical features, such as the missing rate of the user's information, whether the user has large-amount transaction records, statistics on the user's active time, and the rate of change of the user's location. All of these contribute greatly to determining a user's credit rating. In addition to global statistics, binned statistics can also be computed.
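The deposit example and the missing-rate feature can be sketched as follows. The text does not specify which statistic is used to compare a user with the sample, so the percentile rank here is an assumption, as are the function names, the age-band bins, and the sample figures.

```python
from bisect import bisect_right


def missing_rate(row, missing=(-1, "", "NULL", None)):
    """Share of a user's fields that are missing."""
    return sum(1 for v in row if v in missing) / len(row)


def percentile_rank(value, population):
    """Global statistic: fraction of the population whose value is <= `value`."""
    ordered = sorted(population)
    return bisect_right(ordered, value) / len(ordered)


def binned_rank(value, group, populations_by_group):
    """Binned statistic: the same rank, computed only within the user's own bin."""
    return percentile_rank(value, populations_by_group[group])


# The same 50,000 deposit ranks very differently in two different samples.
rank_in_poor_sample = percentile_rank(50000, [3000, 5000, 8000, 50000])
rank_in_rich_sample = percentile_rank(50000, [50000, 100000, 120000, 150000])

by_age_band = {"20-29": [3000, 5000, 8000], "30-39": [50000, 100000]}
rank_in_band = binned_rank(8000, "20-29", by_age_band)
row_missing = missing_rate([30, -1, "NULL", "GZ"])
```

In the first sample the user ranks at the top (1.0), in the second near the bottom (0.25), which is the relative information that a raw deposit value alone cannot express.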
(3) Model training layer
During model training, we make combined use of linear models and tree models, so that different models train on the features from all angles, making more effective use of the features and yielding more accurate results. The models used in the model training layer are:
1. Linear models:
Linear model is the general name for a class of statistical models that includes linear regression models, analysis of variance models, analysis of covariance models, and linear mixed-effects models (also called variance component models). Many phenomena in biology, medicine, economics, management, geology, meteorology, agriculture, industry, engineering, and other fields can be approximately described by linear models. Linear models have therefore become one of the most widely used classes of model in modern statistics.
The linear models adopted by the present invention include:
LASSO
An improved linear algorithm. In essence it is also a kind of linear classifier, but it integrates feature selection and regularization, which improves the accuracy and interpretability of the statistical model.
Liblinear
The algorithm is simple and efficient and is widely used in practice; it is fast and can handle large data volumes; it can effectively handle continuous-valued data as well as discretized features, which are self-explanatory; Liblinear balances goodness of fit against model interpretability reasonably well.
Linear-SVM
It does not use a kernel matrix, so it is much faster than LIBSVM; if extensive feature engineering has been done on the training set and the dimension is very high, Linear-SVM is the more suitable choice, and it also reduces the risk of overfitting.
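For reference, the patent does not state the LASSO objective. In its standard least-squares form, with feature matrix $X$ of $n$ rows, target vector $y$, weight vector $w$, and regularization strength $\alpha$ (all symbols introduced here), it is usually written as:

```latex
\hat{w} \;=\; \arg\min_{w} \; \frac{1}{2n}\,\lVert y - Xw \rVert_2^2 \;+\; \alpha\,\lVert w \rVert_1 ,
\qquad \alpha \ge 0 .
```

The $\ell_1$ penalty drives some coefficients exactly to zero, which is why LASSO performs feature selection and regularization at the same time. How the patent adapts this to classification is not specified in the text.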
2. Decision tree models:
A decision tree is a decision analysis method that, given known probabilities of various scenarios, constructs a tree to obtain the probability that the expected net present value is greater than or equal to zero, evaluates project risk, and judges feasibility; it is a graphical method that applies probability analysis intuitively. Because the decision branches, when drawn, resemble the branches of a tree, it is called a decision tree. In machine learning, a decision tree is a predictive model that represents a mapping between object attributes and object values.
We use the Boosting model among the decision tree models. During training, this model performs a second-order Taylor expansion of the training loss in the objective function and adds a regularization term to the objective so that the optimal solution can be sought as a whole. XGBoost also has the advantages of being fast, portable, requiring little code, and being fault-tolerant.
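The second-order expansion mentioned here corresponds to the objective published for XGBoost; the patent itself gives no formula, so the notation below is taken from the XGBoost literature rather than from this document. At boosting round $t$, with loss $l$, previous prediction $\hat{y}_i^{(t-1)}$, and new tree $f_t$:

```latex
\mathcal{L}^{(t)} \;\approx\; \sum_{i=1}^{n} \Big[ g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t(x_i)^2 \Big] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \big(\hat{y}_i^{(t-1)}\big)^2},

\Omega(f) \;=\; \gamma\, T + \tfrac{1}{2}\, \lambda \sum_{j=1}^{T} w_j^2,
\qquad
w_j^{*} \;=\; -\,\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}.
```

Here $T$ is the number of leaves, $w_j$ the weight of leaf $j$, $I_j$ the set of samples falling in leaf $j$, and $\gamma$, $\lambda$ the regularization parameters. The closed-form leaf weight $w_j^{*}$ is what allows the regularized objective to be optimized as a whole.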
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the present invention are not limited by it. Any other change, modification, substitution, combination, or simplification made without departing from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included within the scope of protection of the present invention.

Claims (10)

1. A user credit evaluation model based on multi-source heterogeneous data, comprising the following steps:
(1) acquisition and merging of multi-source heterogeneous data;
(2) processing of user features;
(3) training of the model.
2. The user credit evaluation model based on multi-source heterogeneous data according to claim 1, characterized in that the acquisition of the multi-source heterogeneous data includes:
using crawler technology to crawl user-related information from web pages;
data provided by the user: to obtain a credit report, the user must first provide appropriate basic personal information;
access to data of third-party institutions authorized by the user;
and the merging of the multi-source heterogeneous data includes:
matching user-authorized information and user-provided data on any one of email address, mobile phone number, or identity card number;
merging information crawled from the web by email address, user name, and user authorization.
3. The user credit evaluation model based on multi-source heterogeneous data according to claim 1, characterized in that the processing of user features includes missing-value and abnormal-feature processing, discrete encoding of categorical features, in-depth mining of time-series features, and obtaining statistical features.
4. The user credit evaluation model based on multi-source heterogeneous data according to claim 1, characterized in that the training of the model includes linear model training and decision tree model training.
5. The user credit evaluation model based on multi-source heterogeneous data according to claim 1, characterized in that the multi-source heterogeneous data includes the user's basic information, academic information, payment information, social network information, operation information, and network behavior information.
6. The user credit evaluation model based on multi-source heterogeneous data according to claim 3, characterized in that the missing-value and abnormal-feature processing specifically includes:
a. features with a missing rate below 20% are filled: numeric features are filled with the mean, and categorical features are filled with the mode;
b. features with a missing rate above 97% undergo discard processing and discrete-encoding conversion: discard processing removes features whose missing share exceeds 97%, and where the missing rate is high, these features are discretely encoded;
c. missing-value statistics matrix: in the user feature matrix, missing entries are set to 1 and non-missing entries are set to 0.
7. The user credit evaluation model based on multi-source heterogeneous data according to claim 3, characterized in that the discrete encoding of categorical features specifically includes: a feature with N possible values is encoded as N binary features that are mutually exclusive, with only one active at a time, which makes the data sparse.
8. The user credit evaluation model based on multi-source heterogeneous data according to claim 3, characterized in that the in-depth mining of time-series features specifically includes:
1. subtracting adjacent periods, giving the difference between periods or intervals;
2. dividing adjacent periods, giving the period-over-period ratio (slope) between periods or intervals;
3. accumulating across periods, giving the change in the running total.
9. The user credit evaluation model based on multi-source heterogeneous data according to claim 3, characterized in that obtaining statistical features specifically includes: computing the missing rate of the user's information, whether the user has large-amount transaction records, statistics on the user's active time, and the rate of change of the user's location; the statistical methods include global statistics or binned statistics.
10. The user credit evaluation model based on multi-source heterogeneous data according to claim 4, characterized in that the linear model training includes LASSO, Liblinear, and Linear-SVM, and the decision tree model training includes Boosting and XGBoost.
CN201610817430.1A 2016-09-12 2016-09-12 User credit evaluation model based on multi-source heterogeneous data Pending CN106408184A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610817430.1A CN106408184A (en) 2016-09-12 2016-09-12 User credit evaluation model based on multi-source heterogeneous data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610817430.1A CN106408184A (en) 2016-09-12 2016-09-12 User credit evaluation model based on multi-source heterogeneous data

Publications (1)

Publication Number Publication Date
CN106408184A true CN106408184A (en) 2017-02-15

Family

ID=57999368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610817430.1A Pending CN106408184A (en) 2016-09-12 2016-09-12 User credit evaluation model based on multi-source heterogeneous data

Country Status (1)

Country Link
CN (1) CN106408184A (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133867A (en) * 2017-04-16 2017-09-05 信阳师范学院 Credit method for anti-counterfeit based on SVMs
CN107423339A (en) * 2017-04-29 2017-12-01 天津大学 Popular microblogging Forecasting Methodology based on extreme Gradient Propulsion and random forest
CN107729519A (en) * 2017-10-27 2018-02-23 上海数据交易中心有限公司 Appraisal procedure and device, terminal based on multi-source multidimensional data
CN107729519B (en) * 2017-10-27 2020-06-09 上海数据交易中心有限公司 Multi-source multi-dimensional data-based evaluation method and device, and terminal
CN107919016B (en) * 2017-11-15 2020-02-18 杭州远眺科技有限公司 Traffic flow parameter missing filling method based on multi-source detector data
CN107919016A (en) * 2017-11-15 2018-04-17 夏莹杰 Traffic flow parameter missing complementing method based on multi-source detector data
CN107798137B (en) * 2017-11-23 2018-12-18 霍尔果斯智融未来信息科技有限公司 A kind of multi-source heterogeneous data fusion architecture system based on additive models
CN107798137A (en) * 2017-11-23 2018-03-13 霍尔果斯智融未来信息科技有限公司 A kind of multi-source heterogeneous data fusion architecture system based on additive models
CN107730154A (en) * 2017-11-23 2018-02-23 安趣盈(上海)投资咨询有限公司 Based on the parallel air control application method of more machine learning models and system
CN109947811A (en) * 2017-11-29 2019-06-28 北京京东金融科技控股有限公司 Generic features library generating method and device, storage medium, electronic equipment
CN108009914A (en) * 2017-12-19 2018-05-08 马上消费金融股份有限公司 A kind of assessing credit risks method, system, equipment and computer-readable storage medium
CN108154430A (en) * 2017-12-28 2018-06-12 上海氪信信息技术有限公司 A kind of credit scoring construction method based on machine learning and big data technology
CN108280759A (en) * 2018-01-17 2018-07-13 深圳市和讯华谷信息技术有限公司 Risk control model optimization method, terminal and computer readable storage medium
CN108449332A (en) * 2018-03-09 2018-08-24 中山大学 Lightweight mobile payment protocol design method based on double gateways
CN108648068A (en) * 2018-05-16 2018-10-12 长沙农村商业银行股份有限公司 Credit risk assessment method and system
CN109086290A (en) * 2018-06-08 2018-12-25 广东万丈金数信息技术股份有限公司 Method and system for judging authenticity of registration information based on multi-source data decision tree
CN108846743A (en) * 2018-06-12 2018-11-20 北京京东金融科技控股有限公司 Method and apparatus for generating information
CN110634060A (en) * 2018-06-21 2019-12-31 马上消费金融股份有限公司 User credit risk assessment method, system, device and storage medium
CN109753356A (en) * 2018-12-25 2019-05-14 北京友信科技有限公司 Container resource regulating method, device and computer readable storage medium
CN110119511A (en) * 2019-05-17 2019-08-13 网易传媒科技(北京)有限公司 Prediction method, medium, device and computing equipment for article hotspot score
CN110119511B (en) * 2019-05-17 2023-05-02 网易传媒科技(北京)有限公司 Article hotspot score prediction method, medium, device and computing equipment
CN110321342A (en) * 2019-05-27 2019-10-11 平安科技(深圳)有限公司 Business valuation studies method, apparatus and storage medium based on intelligent characteristic selection
CN110363662A (en) * 2019-08-19 2019-10-22 上海理工大学 Personal credit scoring system
WO2021043094A1 (en) * 2019-09-06 2021-03-11 平安科技(深圳)有限公司 Location service-based location identification method and apparatus, device, and storage medium
CN111045716A (en) * 2019-11-04 2020-04-21 中山大学 Related patch recommendation method based on heterogeneous data
CN111045716B (en) * 2019-11-04 2022-02-22 中山大学 Related patch recommendation method based on heterogeneous data
CN110909040A (en) * 2019-11-08 2020-03-24 支付宝(杭州)信息技术有限公司 Business delivery auxiliary method and device and electronic equipment
CN110909040B (en) * 2019-11-08 2022-03-04 支付宝(杭州)信息技术有限公司 Business delivery auxiliary method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN106408184A (en) User credit evaluation model based on multi-source heterogeneous data
US20200192894A1 (en) System and method for using data incident based modeling and prediction
US20210042767A1 (en) Digital content prioritization to accelerate hyper-targeting
CN107818344A (en) Method and system for classifying and predicting user behavior
CN109657918A (en) Risk early-warning method, device and computer equipment for associated assessment objects
CN106600052A (en) User attribute and social network detection system based on space-time locus
CN106897930A (en) Method and device for credit evaluation
CN101493913A (en) Method and system for assessing user credit in internet
CN107704512A (en) Financial product recommendation method based on social data, electronic device and medium
CN110310163A (en) Method, equipment and readable medium for accurately formulating marketing strategy
US20170032270A1 (en) Method for predicting personality trait and device therefor
US11538044B2 (en) System and method for generation of case-based data for training machine learning classifiers
CN108021651A (en) Network public opinion risk assessment method and device
US11494850B1 (en) Applied artificial intelligence technology for detecting anomalies in payroll data
Naganathan Comparative analysis of Big data, Big data analytics: Challenges and trends
Zou et al. A novel network security algorithm based on improved support vector machine from smart city perspective
Kumar et al. The zeitgeist juncture of “big data” and its future trends
Zaarour Financial statements earnings manipulation detection using a layer of machine learning
Ait et al. An empirical study on the survival rate of GitHub projects
Qudsi et al. Predictive data mining of chronic diseases using decision tree: a case study of health insurance company in Indonesia
Mai et al. Detecting the intellectual pathway of resilience thinking in urban and regional studies: A critical reflection on resilience literature
CN107025494A (en) Data prediction method, financing recommendation method, device and terminal device
Sharma et al. Importance of Big Data in financial fraud detection
Rajaleximi et al. Feature selection using optimized multiple rank score model for credit scoring
CN115358878A (en) Financing user risk preference level analysis method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170215