CN106408184A - User credit evaluation model based on multi-source heterogeneous data - Google Patents
User credit evaluation model based on multi-source heterogeneous data
- Publication number
- CN106408184A (application number CN201610817430.1A)
- Authority
- CN
- China
- Prior art keywords
- user
- feature
- source heterogeneous
- heterogeneous data
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0637—Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
Landscapes
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Educational Administration (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a user credit evaluation model based on multi-source heterogeneous data, comprising the following steps: (1) acquiring and merging multi-source heterogeneous data; (2) processing user features; and (3) training the model. In the proposed framework, feature expansion is followed by feature selection: the user's data dimensions are first expanded, and useful features are then selected, which reduces the feature dimension and the time complexity of the model. Missing and abnormal data are handled during feature processing, which improves the robustness of the model to missing values.
Description
Technical field
The present invention relates to the field of credit evaluation, and in particular to a user credit evaluation model based on multi-source heterogeneous data.
Background
User credit evaluation means that a credit evaluation agency uses expert judgment or mathematical methods to comprehensively evaluate the ability of individuals and enterprises to honor their commitments and their degree of creditworthiness, and expresses the result in simple symbols or words to meet the needs of the market. Credit evaluation is widely used in finance. Traditional financial institutions evaluate credit from the financial and behavioral records that a user holds at that institution. With the development of big data, the limited data used by traditional credit evaluation is due to be updated and replaced.
With the development of the Internet, records of users' various behaviors are generated online every day. These data reflect users' real behavior and are therefore important data for user credit evaluation. Using a user's multi-source heterogeneous data for credit evaluation has become a new trend. We propose to make in-depth use of data in the following dimensions for user credit evaluation:
1) Basic information: demographic information such as the user's age, native place, and current work address;
2) Network behavior information: the web pages the user browses, the tools used to browse them, and the distribution and duration of browsing;
3) Educational background information: the user's enrollment and education records;
4) Social network information: the user's behavior and social information on public social networks such as Weibo and Zhihu;
5) Third-party payment information: the user's consumption records on third-party payment platforms;
6) Online questionnaire information: credit-related information and basic information collected through questionnaires.
The data in the six dimensions above all come from the Internet, which clearly distinguishes them from the data used in traditional credit evaluation. An Internet user's data can reach thousands of dimensions and come from different sources, so the user's credit can be evaluated from many angles; data with more dimensions describe a user's credit standing more fully.
However, raising the data dimension from tens to thousands also makes the model harder to build. The challenges facing the model can be summarized as follows:
1. High dimensionality. Traditional credit evaluation models are built on a few dozen features and train quickly, so data dimension received little attention. Evaluating users from Internet information considers not only transaction-related information but also social networks, behavioral preferences, and other dimensions, so the data can reach thousands of dimensions. Such high-dimensional data require a good feature selection method that reduces the feature dimension without degrading evaluation quality, improving both the training speed and the practical effectiveness of the model;
2. Missing values and outliers. Because many dimensions are considered, a user cannot have a value in every dimension, and in many cases a user has many missing values. In addition, because some data are obtained implicitly, they cannot be guaranteed to be fully correct during collection or transmission, so the data also contain some outliers. Current models rarely propose specific solutions to this problem, yet handling missing values and outliers is important for improving evaluation quality.
Summary of the invention
To solve the above problems, the present invention proposes a user credit evaluation model based on multi-source heterogeneous data, comprising the following steps:
(1) acquiring and merging multi-source heterogeneous data;
(2) processing user features;
(3) training the model.
Further, the acquisition of the multi-source heterogeneous data includes:
using web crawlers to collect user-related information from web pages;
information provided by the user, since supplying appropriate basic personal information is a precondition for the user to obtain a credit report;
access, authorized by the user, to data held by third-party institutions.
Further, the merging of the multi-source heterogeneous data includes:
matching user-authorized information with user-provided data on any one of email address, mobile phone number, or identity card ID;
merging information crawled from the web by email address and user name, with the user's authorization.
Further, the processing of user features includes handling missing and abnormal feature values, discrete encoding of categorical features, in-depth mining of time-series features, and deriving statistical features.
Further, the training of the model includes linear model training and decision tree model training.
Further, the multi-source heterogeneous data include the user's basic information, educational information, payment information, social network information, operation information, and network behavior information.
Further, the handling of missing and abnormal feature values specifically includes:
a. features with a missing rate below 20% are filled: numeric features are filled with the mean, and categorical features are filled with the mode;
b. features with a missing rate above 97% undergo discarding and discrete-encoding conversion: discarding removes features whose missing rate exceeds 97%, and, where the missing rate is high, the features are discretely encoded;
c. missing-value indicator matrix: in the user feature matrix, missing entries are set to 1 and non-missing entries are set to 0.
Further, the discrete encoding of categorical features specifically includes: a feature with N possible values is encoded as N binary features; these features are mutually exclusive and exactly one is active at a time, which makes the data sparse.
Further, the in-depth mining of time-series features specifically includes:
1. subtracting adjacent periods, which represents the difference between periods or intervals;
2. dividing adjacent periods, which represents the period-over-period ratio (slope) between periods or intervals;
3. accumulating over periods, which represents the change in the running sum.
Further, deriving statistical features specifically includes computing the missing rate of the user's information, whether the user has large-transaction records, statistics on the user's active time, and the rate of change of the user's location; the statistics are computed either globally or per bin.
Further, the linear model training includes LASSO, Liblinear, and Linear-SVM; the decision tree model training includes Boosting and XGBoost.
The beneficial effects of the invention are as follows. In the proposed framework, feature expansion is followed by feature selection: the user's data dimensions are first expanded, and useful features are then selected, which reduces the feature dimension and the time complexity of the model. At the same time, missing and abnormal data are handled during feature processing, which gives the model robustness to missing values.
Brief description of the drawings
Fig. 1 is a diagram of user credit evaluation based on multi-source heterogeneous data;
Fig. 2 is a diagram of the missing-value indicator matrix;
Fig. 3 is a diagram of how address-type features are constructed;
Fig. 4 is a diagram of the mapping used for division.
Detailed description
The present invention is described in detail below.
As shown in Fig. 1, user credit evaluation based on multi-source heterogeneous data comprises three major steps:
(1) acquiring and merging multi-source heterogeneous data;
(2) processing user features;
(3) training the model.
Wherein:
(1) Data foundation layer
The data foundation layer includes the user's basic information, educational information, payment information, social network information, operation information, network behavior information, and so on, collected in a network environment. These data come from different sources and can effectively express every aspect of the user, which is key to letting the model grasp the user's credit situation more accurately. The data are linked through any one of user ID, identity card ID, email address, or mobile phone number. Linking the multi-source data to a user prepares the data for the next step, the multi-dimensional evaluation of the user's credit.
Specifically, the multi-source heterogeneous data are acquired through:
1) web crawlers, which collect user-related information from web pages;
2) information provided by the user, since supplying appropriate basic personal information is a precondition for the user to obtain a credit report;
3) access, authorized by the user, to data held by third-party institutions.
The multi-source heterogeneous data are merged by:
1) matching user-authorized information with user-provided data on any one of email address, mobile phone number, or identity card ID;
2) merging information crawled from the web by email address, user name, or IP (with the user's authorization).
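The matching step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the field names `email`, `phone`, and `id_card`, and the function `merge_sources`, are hypothetical, and records are assumed to be plain dictionaries. Records that share any identifier value are grouped into one user with a small union-find structure.

```python
from collections import defaultdict

# Hypothetical identifier fields; the text names email address,
# mobile phone number, and identity card ID as matching keys.
MATCH_KEYS = ("email", "phone", "id_card")


def merge_sources(records):
    """Merge records from different sources that share any identifier value."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    seen = {}  # (key, value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for key in MATCH_KEYS:
            value = rec.get(key)
            if value is None:
                continue
            if (key, value) in seen:
                # Same identifier seen before: join the two groups.
                parent[find(idx)] = find(seen[(key, value)])
            else:
                seen[(key, value)] = idx

    merged = defaultdict(dict)
    for idx, rec in enumerate(records):
        merged[find(idx)].update(rec)  # later sources overwrite earlier ones
    return list(merged.values())


records = [
    {"email": "a@example.com", "age": 30},                      # user-provided
    {"email": "a@example.com", "phone": "123", "deposit": 50000},  # authorized
    {"phone": "123", "school": "Example University"},           # crawled
    {"phone": "999", "age": 41},                                # another user
]
users = merge_sources(records)
```

Because matching is transitive, the third record joins the first user through the phone number even though it carries no email address. Conflicting values for the same field are resolved here by letting later records win, which is an arbitrary choice for the sketch.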
(2) Data analysis layer
The data analysis layer covers several ways of processing the data. In short, it finds ordered, structured features in messy, unordered data, so that the user's information is expressed more precisely. The work of this layer comprises:
1. Handling missing and abnormal feature values
Merging multi-source data inevitably produces a large amount of missing data. There are many causes: for example, the user has no payment records at a certain bank, the user's basic information was not collected, or the user simply left some fields blank when filling in a form. Different degrees of missingness call for different forms of preprocessing.
Missing values appear in forms such as "-1" in numeric features, or an empty string or "NULL" in categorical features. Features are processed according to their missing rate:
a. Features with a missing rate below 20% are filled: numeric features are filled with the mean, and categorical features are filled with the mode. This filling threshold and filling method gave the best results in testing;
b. Features with a missing rate above 97% undergo discarding and discrete-encoding conversion. Discarding removes features whose missing rate exceeds 97%. Because a feature with a high missing rate tends to behave like a discrete feature, such features are also discretely encoded.
c. Missing-value indicator matrix: as shown in Fig. 2, in the user feature matrix, missing entries are set to 1 and non-missing entries are set to 0. These features are built because missingness is itself considered a kind of information.
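Rules a to c can be sketched as below. This is an illustrative reading of the text rather than the patent's code: the column layout (a dictionary of lists) and the names `process_missing` and `is_missing` are assumptions, and features whose missing rate lies between the two thresholds are simply left untouched here because the text does not say how they are treated.

```python
from statistics import mean, mode

FILL_BELOW = 0.20  # rule a: fill features whose missing rate is below 20%
DROP_ABOVE = 0.97  # rule b: discard features whose missing rate exceeds 97%


def is_missing(value):
    # Missing-value markers named in the text: -1, empty string, "NULL".
    return value is None or value in ("", "NULL", -1)


def process_missing(columns):
    """columns maps feature name -> list of values, one per user.

    Returns (cleaned columns, missing-value indicator columns).
    """
    cleaned, indicator = {}, {}
    for name, values in columns.items():
        flags = [1 if is_missing(v) else 0 for v in values]
        indicator[name] = flags  # rule c: 1 = missing, 0 = present
        rate = sum(flags) / len(values)
        if rate > DROP_ABOVE:
            continue  # rule b: discard the feature
        if 0 < rate < FILL_BELOW:
            present = [v for v, f in zip(values, flags) if not f]
            numeric = all(isinstance(v, (int, float)) for v in present)
            fill = mean(present) if numeric else mode(present)  # rule a
            values = [fill if f else v for v, f in zip(values, flags)]
        cleaned[name] = values
    return cleaned, indicator


columns = {
    "income": [10, 20, -1, 30, 40, 60],
    "city": ["A", "A", "", "B", "A", "A"],
    "rare": [-1, -1, -1, -1, -1, -1],
}
cleaned, indicator = process_missing(columns)
```

The indicator columns are computed before any feature is dropped, so a discarded feature still contributes its missingness pattern.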
2. Discrete encoding of categorical features
Discrete encoding takes a feature with N possible values and encodes it as N binary features. These features are mutually exclusive and exactly one is active at a time, which makes the data sparse. The benefit is that tree models can discriminate the feature better, and the encoding also serves to augment the features. During feature construction, discrete encoding is applied to categorical features (excluding addresses) and to numeric features with fewer than 12 distinct values. Addresses are excluded because encoding them directly would produce too many feature dimensions, increasing model complexity without a corresponding gain. Address features are instead transformed in a finer-grained way, as shown in Fig. 3.
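The encoding described above is commonly called one-hot encoding. A minimal sketch follows; the function name `one_hot`, the `name=value` column naming, and the decision to return the feature unchanged when it has too many levels are assumptions made for illustration.

```python
def one_hot(name, values, max_levels=12):
    """Encode a feature with N distinct values as N binary columns.

    Features with max_levels or more distinct values are returned unchanged,
    following the "fewer than 12 distinct values" rule in the text.
    """
    levels = sorted(set(values), key=str)
    if len(levels) >= max_levels:
        return {name: list(values)}
    return {
        f"{name}={level}": [1 if v == level else 0 for v in values]
        for level in levels
    }


encoded = one_hot("education", ["BSc", "MSc", "BSc", "PhD"])
```

Each row has exactly one active column, which is what makes the resulting matrix sparse.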
3. In-depth mining of time-series features
Part of the collected data is clearly time-dependent. For example, a person's payment records contain payment information from different periods, and behavior records differ between periods. The trends in these time series can effectively capture the trend of a person's credit. The features of different periods are therefore processed in finer detail:
1. subtracting adjacent periods, which represents the difference between periods or intervals;
2. dividing adjacent periods, which represents the period-over-period ratio (slope) between periods or intervals;
3. accumulating over periods, which represents the change in the running sum.
For division, some columns contain missing values (uniformly represented as -1), so the feature columns cannot be divided directly. Cases that cannot be divided directly are handled using the mapping shown in Fig. 4.
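The three transformations can be sketched as below. The mapping of Fig. 4 is not reproduced in this text, so the handling of missing values here is an assumption: a difference or ratio involving a missing period (or a zero denominator) is reported as `None`, and missing periods contribute 0 to the running sum. The function name `temporal_features` is hypothetical, and genuine values are assumed to be non-negative so that -1 is unambiguous.

```python
from itertools import accumulate

MISSING = -1  # missing periods are uniformly represented as -1 in the text


def temporal_features(series):
    """Return (differences, ratios, running sums) for one user's time series."""
    pairs = list(zip(series, series[1:]))
    # 1. Difference between adjacent periods.
    diffs = [None if MISSING in (a, b) else b - a for a, b in pairs]
    # 2. Period-over-period ratio; undefined if a value is missing or a == 0.
    ratios = [None if MISSING in (a, b) or a == 0 else b / a for a, b in pairs]
    # 3. Running sum, skipping missing periods (assumption).
    cumulative = list(accumulate(0 if v == MISSING else v for v in series))
    return diffs, ratios, cumulative


diffs, ratios, cumulative = temporal_features([100, 150, -1, 300])
```

A downstream model would still need a numeric encoding for the `None` entries, for example a dedicated category, which is the role the mapping in Fig. 4 appears to play.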
4. Statistical features
Statistical features capture global information. For example, suppose a person has bank deposits of 50,000. If deposits across the whole sample are only a few thousand, this person counts as relatively rich; if deposits across the whole sample are all around 100,000, this person counts as relatively poor. Without global statistics, such information is hard to capture, so statistical features are also an important indicator for evaluating users.
Beyond the feature construction above, several statistical features are proposed, such as the missing rate of the user's information, whether the user has large-transaction records, statistics on the user's active time, and the rate of change of the user's location. All of these contribute substantially to determining the user's credit rating. Besides global statistics, statistics can also be computed per bin.
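The deposit example can be made concrete with a global rank and a bin index. This is one possible reading of "global statistics" and "binned statistics", not the patent's definition; the function names and the bin edges are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, population):
    """Fraction of the population with a value strictly below `value`."""
    ordered = sorted(population)
    return bisect_left(ordered, value) / len(ordered)


def bin_index(value, edges):
    """Index of the bin containing `value`, given ascending bin edges."""
    return bisect_right(edges, value)


# The same deposit of 50,000 ranks very differently in two populations.
rank_low = percentile_rank(50000, [3000, 5000, 8000, 50000])
rank_high = percentile_rank(50000, [100000, 120000, 50000, 90000])
deposit_bin = bin_index(50000, [10000, 100000])
```

The raw value is identical in both cases; only the global context tells the model whether the user sits at the top or the bottom of the sample.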
(3) Model training layer
Model training combines linear models and tree models, so that different models train on the features from all angles. This uses the features more effectively and produces more accurate results. The models used in the model training layer are:
1. Linear models:
Linear model is the general name for a class of statistical models that includes linear regression models, analysis of variance models, analysis of covariance models, and linear mixed models (also called variance component models). Many phenomena in biology, medicine, economics, management, geology, meteorology, agriculture, industry, and engineering can be described approximately by linear models. Linear models have therefore become one of the most widely used classes of model in modern statistics.
The linear models adopted by the present invention include:
LASSO
An improved linear algorithm. In essence it is still a linear classifier, but it integrates feature selection and regularization, improving the accuracy and interpretability of the statistical model.
Liblinear
A simple and efficient algorithm that is widely used in practice, is fast, and can handle large volumes of data. It handles continuous-valued data and discretized features effectively, and its features are easy to interpret. Liblinear balances goodness of fit against model interpretability, and does so comparatively well.
Linear-SVM
It does not use a kernel matrix, so it is much faster than LIBSVM. If heavy feature engineering has been applied to the training set and the dimension is very high, Linear-SVM is a better fit, and it also reduces the risk of overfitting.
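To illustrate how LASSO combines regularization with feature selection, here is a minimal coordinate-descent sketch for the objective (1/2n)·||y − Xw||² + α·||w||₁. It is a generic textbook formulation, not the patent's implementation; the function names are hypothetical, no intercept is fitted, and in practice an established library would be used instead.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_fit(X, y, alpha, n_iter=100):
    """Fit LASSO weights by cyclic coordinate descent (no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for xi, yi in zip(X, y):
                pred_without_j = sum(w[k] * xi[k] for k in range(d) if k != j)
                rho += xi[j] * (yi - pred_without_j)
                z += xi[j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z else 0.0
    return w


# Feature 0 drives y strongly, feature 1 only weakly.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.05, 1.95, -1.95, -2.05]
weights = lasso_fit(X, y, alpha=0.1)
```

With these inputs the weight of the weak feature is driven exactly to zero while the strong feature keeps a shrunken weight of about 1.9, which is the feature-selection effect the text refers to.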
2. Decision tree models:
A decision tree is a decision-analysis method that, given the known probabilities of various outcomes, builds a tree to obtain the probability that the expected net present value is greater than or equal to zero, in order to assess project risk and judge feasibility. It is a graphical method that applies probability analysis intuitively, and it is called a decision tree because the decision branches, when drawn, resemble the branches of a tree. In machine learning, a decision tree is a predictive model that represents a mapping between object attributes and object values.
The Boosting model among the decision tree models is used. When optimizing the objective function during training, this model applies a second-order Taylor expansion to the training loss and adds a regularization term to the objective function, so that the optimal solution can be sought as a whole. XGBoost also has the advantages of being fast, portable, requiring little code, and being fault tolerant.
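For reference, the second-order expansion and regularization term mentioned above are usually written as follows in the published XGBoost formulation. The symbols are not defined in this document and are introduced here as assumptions: l is the training loss, f_t is the tree added at round t, g_i and h_i are the first and second derivatives of the loss with respect to the previous prediction, T is the number of leaves, and w is the vector of leaf weights.

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big)
    + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2 \Big] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^2},
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^2
```

Because the first term in the sum does not depend on f_t, each boosting round only needs g_i and h_i, which is why the loss function can be swapped without changing the tree-building procedure.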
The embodiment described above is a preferred embodiment of the present invention, but embodiments of the invention are not limited to it. Any other change, modification, substitution, combination, or simplification made without departing from the spirit and principle of the present invention shall be regarded as an equivalent replacement and falls within the scope of protection of the present invention.
Claims (10)
1. A user credit evaluation model based on multi-source heterogeneous data, comprising the following steps:
(1) acquiring and merging multi-source heterogeneous data;
(2) processing user features;
(3) training the model.
2. The user credit evaluation model based on multi-source heterogeneous data according to claim 1, characterized in that the acquisition of the multi-source heterogeneous data includes:
using web crawlers to collect user-related information from web pages;
information provided by the user, since supplying appropriate basic personal information is a precondition for the user to obtain a credit report;
access, authorized by the user, to data held by third-party institutions;
and the merging of the multi-source heterogeneous data includes:
matching user-authorized information with user-provided data on any one of email address, mobile phone number, or identity card ID;
merging information crawled from the web by email address and user name, with the user's authorization.
3. The user credit evaluation model based on multi-source heterogeneous data according to claim 1, characterized in that the processing of user features includes handling missing and abnormal feature values, discrete encoding of categorical features, in-depth mining of time-series features, and deriving statistical features.
4. The user credit evaluation model based on multi-source heterogeneous data according to claim 1, characterized in that the training of the model includes linear model training and decision tree model training.
5. The user credit evaluation model based on multi-source heterogeneous data according to claim 1, characterized in that the multi-source heterogeneous data include the user's basic information, educational information, payment information, social network information, operation information, and network behavior information.
6. The user credit evaluation model based on multi-source heterogeneous data according to claim 3, characterized in that the handling of missing and abnormal feature values specifically includes:
a. features with a missing rate below 20% are filled: numeric features are filled with the mean, and categorical features are filled with the mode;
b. features with a missing rate above 97% undergo discarding and discrete-encoding conversion: discarding removes features whose missing rate exceeds 97%, and, where the missing rate is high, the features are discretely encoded;
c. missing-value indicator matrix: in the user feature matrix, missing entries are set to 1 and non-missing entries are set to 0.
7. The user credit evaluation model based on multi-source heterogeneous data according to claim 3, characterized in that the discrete encoding of categorical features specifically includes: a feature with N possible values is encoded as N binary features; these features are mutually exclusive and exactly one is active at a time, which makes the data sparse.
8. The user credit evaluation model based on multi-source heterogeneous data according to claim 3, characterized in that the in-depth mining of time-series features specifically includes:
1. subtracting adjacent periods, which represents the difference between periods or intervals;
2. dividing adjacent periods, which represents the period-over-period ratio (slope) between periods or intervals;
3. accumulating over periods, which represents the change in the running sum.
9. The user credit evaluation model based on multi-source heterogeneous data according to claim 3, characterized in that deriving statistical features specifically includes computing the missing rate of the user's information, whether the user has large-transaction records, statistics on the user's active time, and the rate of change of the user's location; the statistics are computed either globally or per bin.
10. The user credit evaluation model based on multi-source heterogeneous data according to claim 4, characterized in that the linear model training includes LASSO, Liblinear, and Linear-SVM, and the decision tree model training includes Boosting and XGBoost.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610817430.1A CN106408184A (en) | 2016-09-12 | 2016-09-12 | User credit evaluation model based on multi-source heterogeneous data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610817430.1A CN106408184A (en) | 2016-09-12 | 2016-09-12 | User credit evaluation model based on multi-source heterogeneous data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106408184A true CN106408184A (en) | 2017-02-15 |
Family
ID=57999368
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610817430.1A Pending CN106408184A (en) | 2016-09-12 | 2016-09-12 | User credit evaluation model based on multi-source heterogeneous data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106408184A (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107133867A (en) * | 2017-04-16 | 2017-09-05 | 信阳师范学院 | Credit method for anti-counterfeit based on SVMs |
CN107423339A (en) * | 2017-04-29 | 2017-12-01 | 天津大学 | Popular microblogging Forecasting Methodology based on extreme Gradient Propulsion and random forest |
CN107730154A (en) * | 2017-11-23 | 2018-02-23 | 安趣盈(上海)投资咨询有限公司 | Based on the parallel air control application method of more machine learning models and system |
CN107729519A (en) * | 2017-10-27 | 2018-02-23 | 上海数据交易中心有限公司 | Appraisal procedure and device, terminal based on multi-source multidimensional data |
CN107798137A (en) * | 2017-11-23 | 2018-03-13 | 霍尔果斯智融未来信息科技有限公司 | A kind of multi-source heterogeneous data fusion architecture system based on additive models |
CN107919016A (en) * | 2017-11-15 | 2018-04-17 | 夏莹杰 | Traffic flow parameter missing complementing method based on multi-source detector data |
CN108009914A (en) * | 2017-12-19 | 2018-05-08 | 马上消费金融股份有限公司 | A kind of assessing credit risks method, system, equipment and computer-readable storage medium |
CN108154430A (en) * | 2017-12-28 | 2018-06-12 | 上海氪信信息技术有限公司 | A kind of credit scoring construction method based on machine learning and big data technology |
CN108280759A (en) * | 2018-01-17 | 2018-07-13 | 深圳市和讯华谷信息技术有限公司 | Air control model optimization method, terminal and computer readable storage medium |
CN108449332A (en) * | 2018-03-09 | 2018-08-24 | 中山大学 | A kind of lightweight Mobile Payment Protocol design method based on double gateways |
CN108648068A (en) * | 2018-05-16 | 2018-10-12 | 长沙农村商业银行股份有限公司 | A kind of assessing credit risks method and system |
CN108846743A (en) * | 2018-06-12 | 2018-11-20 | 北京京东金融科技控股有限公司 | Method and apparatus for generating information |
CN109086290A (en) * | 2018-06-08 | 2018-12-25 | 广东万丈金数信息技术股份有限公司 | Registration information judgment method of authenticity and system based on multi-source data decision tree |
CN109753356A (en) * | 2018-12-25 | 2019-05-14 | 北京友信科技有限公司 | A kind of container resource regulating method, device and computer readable storage medium |
CN109947811A (en) * | 2017-11-29 | 2019-06-28 | 北京京东金融科技控股有限公司 | Generic features library generating method and device, storage medium, electronic equipment |
CN110119511A (en) * | 2019-05-17 | 2019-08-13 | 网易传媒科技(北京)有限公司 | Prediction technique, medium, device and the calculating equipment of article hot spot score |
CN110321342A (en) * | 2019-05-27 | 2019-10-11 | 平安科技(深圳)有限公司 | Business valuation studies method, apparatus and storage medium based on intelligent characteristic selection |
CN110363662A (en) * | 2019-08-19 | 2019-10-22 | 上海理工大学 | A kind of personal credit points-scoring system |
CN110634060A (en) * | 2018-06-21 | 2019-12-31 | 马上消费金融股份有限公司 | User credit risk assessment method, system, device and storage medium |
CN110909040A (en) * | 2019-11-08 | 2020-03-24 | 支付宝(杭州)信息技术有限公司 | Business delivery auxiliary method and device and electronic equipment |
CN111045716A (en) * | 2019-11-04 | 2020-04-21 | 中山大学 | Related patch recommendation method based on heterogeneous data |
WO2021043094A1 (en) * | 2019-09-06 | 2021-03-11 | 平安科技(深圳)有限公司 | Location service-based location identification method and apparatus, device, and storage medium |
-
2016
- 2016-09-12 CN CN201610817430.1A patent/CN106408184A/en active Pending
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107133867A (en) * | 2017-04-16 | 2017-09-05 | 信阳师范学院 | Credit method for anti-counterfeit based on SVMs |
CN107423339A (en) * | 2017-04-29 | 2017-12-01 | 天津大学 | Popular microblogging Forecasting Methodology based on extreme Gradient Propulsion and random forest |
CN107729519A (en) * | 2017-10-27 | 2018-02-23 | 上海数据交易中心有限公司 | Appraisal procedure and device, terminal based on multi-source multidimensional data |
CN107729519B (en) * | 2017-10-27 | 2020-06-09 | 上海数据交易中心有限公司 | Multi-source multi-dimensional data-based evaluation method and device, and terminal |
CN107919016B (en) * | 2017-11-15 | 2020-02-18 | 杭州远眺科技有限公司 | Traffic flow parameter missing filling method based on multi-source detector data |
CN107919016A (en) * | 2017-11-15 | 2018-04-17 | 夏莹杰 | Traffic flow parameter missing complementing method based on multi-source detector data |
CN107798137B (en) * | 2017-11-23 | 2018-12-18 | 霍尔果斯智融未来信息科技有限公司 | A kind of multi-source heterogeneous data fusion architecture system based on additive models |
CN107798137A (en) * | 2017-11-23 | 2018-03-13 | 霍尔果斯智融未来信息科技有限公司 | A kind of multi-source heterogeneous data fusion architecture system based on additive models |
CN107730154A (en) * | 2017-11-23 | 2018-02-23 | 安趣盈(上海)投资咨询有限公司 | Based on the parallel air control application method of more machine learning models and system |
CN109947811A (en) * | 2017-11-29 | 2019-06-28 | 北京京东金融科技控股有限公司 | Generic features library generating method and device, storage medium, electronic equipment |
CN108009914A (en) * | 2017-12-19 | 2018-05-08 | 马上消费金融股份有限公司 | A kind of assessing credit risks method, system, equipment and computer-readable storage medium |
CN108154430A (en) * | 2017-12-28 | 2018-06-12 | 上海氪信信息技术有限公司 | A kind of credit scoring construction method based on machine learning and big data technology |
CN108280759A (en) * | 2018-01-17 | 2018-07-13 | 深圳市和讯华谷信息技术有限公司 | Air control model optimization method, terminal and computer readable storage medium |
CN108449332A (en) * | 2018-03-09 | 2018-08-24 | 中山大学 | A kind of lightweight Mobile Payment Protocol design method based on double gateways |
CN108648068A (en) * | 2018-05-16 | 2018-10-12 | 长沙农村商业银行股份有限公司 | A kind of assessing credit risks method and system |
CN109086290A (en) * | 2018-06-08 | 2018-12-25 | 广东万丈金数信息技术股份有限公司 | Registration information judgment method of authenticity and system based on multi-source data decision tree |
CN108846743A (en) * | 2018-06-12 | 2018-11-20 | 北京京东金融科技控股有限公司 | Method and apparatus for generating information |
CN110634060A (en) * | 2018-06-21 | 2019-12-31 | 马上消费金融股份有限公司 | User credit risk assessment method, system, device and storage medium |
CN109753356A (en) * | 2018-12-25 | 2019-05-14 | 北京友信科技有限公司 | Container resource regulating method, device and computer-readable storage medium |
CN110119511A (en) * | 2019-05-17 | 2019-08-13 | 网易传媒科技(北京)有限公司 | Article hotspot score prediction method, medium, device and computing equipment |
CN110119511B (en) * | 2019-05-17 | 2023-05-02 | 网易传媒科技(北京)有限公司 | Article hotspot score prediction method, medium, device and computing equipment |
CN110321342A (en) * | 2019-05-27 | 2019-10-11 | 平安科技(深圳)有限公司 | Business valuation studies method, apparatus and storage medium based on intelligent characteristic selection |
CN110363662A (en) * | 2019-08-19 | 2019-10-22 | 上海理工大学 | Personal credit scoring system |
WO2021043094A1 (en) * | 2019-09-06 | 2021-03-11 | 平安科技(深圳)有限公司 | Location service-based location identification method and apparatus, device, and storage medium |
CN111045716A (en) * | 2019-11-04 | 2020-04-21 | 中山大学 | Related patch recommendation method based on heterogeneous data |
CN111045716B (en) * | 2019-11-04 | 2022-02-22 | 中山大学 | Related patch recommendation method based on heterogeneous data |
CN110909040A (en) * | 2019-11-08 | 2020-03-24 | 支付宝(杭州)信息技术有限公司 | Business delivery auxiliary method and device and electronic equipment |
CN110909040B (en) * | 2019-11-08 | 2022-03-04 | 支付宝(杭州)信息技术有限公司 | Business delivery auxiliary method and device and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106408184A (en) | User credit evaluation model based on multi-source heterogeneous data | |
US20200192894A1 (en) | System and method for using data incident based modeling and prediction | |
US20210042767A1 (en) | Digital content prioritization to accelerate hyper-targeting | |
CN107818344A (en) | Method and system for classifying and predicting user behavior | |
CN109657918A (en) | Risk early-warning method, device and computer equipment for associated assessment objects | |
CN106600052A (en) | User attribute and social network detection system based on space-time locus | |
CN106897930A (en) | Credit evaluation method and device | |
CN101493913A (en) | Method and system for assessing user credit in internet | |
CN107704512A (en) | Financial product recommendation method based on social data, electronic device and medium | |
CN110310163A (en) | Method, equipment and readable medium for accurately formulating marketing strategy | |
US20170032270A1 (en) | Method for predicting personality trait and device therefor | |
US11538044B2 (en) | System and method for generation of case-based data for training machine learning classifiers | |
CN108021651A (en) | Network public opinion risk assessment method and device | |
US11494850B1 (en) | Applied artificial intelligence technology for detecting anomalies in payroll data | |
Naganathan | Comparative analysis of Big data, Big data analytics: Challenges and trends | |
Zou et al. | A novel network security algorithm based on improved support vector machine from smart city perspective | |
Kumar et al. | The zeitgeist juncture of “big data” and its future trends | |
Zaarour | Financial statements earnings manipulation detection using a layer of machine learning | |
Ait et al. | An empirical study on the survival rate of GitHub projects | |
Qudsi et al. | Predictive data mining of chronic diseases using decision tree: a case study of health insurance company in Indonesia | |
Mai et al. | Detecting the intellectual pathway of resilience thinking in urban and regional studies: A critical reflection on resilience literature | |
CN107025494A (en) | Data prediction method, financing recommendation method, device and terminal device | |
Sharma et al. | Importance of Big Data in financial fraud detection | |
Rajaleximi et al. | Feature selection using optimized multiple rank score model for credit scoring | |
CN115358878A (en) | Financing user risk preference level analysis method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170215 |