CN108446291A - Real-time scoring method and scoring system for user credit - Google Patents
- Publication number
- CN108446291A CN201711444140.8A
- Authority
- CN
- China
- Prior art keywords
- data
- user
- real
- basic data
- flow computation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
Abstract
The present invention proposes a real-time scoring method for user credit, comprising the following steps. Data acquisition step: obtain the user's basic data over the internet. Data processing step: import the acquired basic data into a data-stream computing cluster for real-time data processing. Scoring step: feed the processed basic data into one or more scoring models, which are built from existing data, to produce a score. Data storage and feedback step: save the basic data, the processed basic data, and the score to a database, and obtain feedback information. Model evaluation and optimization step: evaluate and optimize the scoring models and the data-stream computing cluster according to the saved basic data, processed basic data, score, and feedback information. Update step: update the scoring models and data-stream computing cluster in use according to the optimized scoring models and data-stream computing cluster. The present invention also proposes a real-time scoring system for user credit.
Description
Technical field
The present invention relates to internet technology, and more specifically to user evaluation technology in the field of internet finance.
Background art
Internet finance has developed rapidly along with internet technology. Unlike the traditional financial industry, most internet finance operations are completed online rather than on site. Internet finance is fast and convenient, runs entirely online, and requires no on-site handling. This greatly improves ease of use, so it has been widely welcomed and has grown quickly.
However, internet finance operated online also faces difficulties. Risk control is the lifeblood of finance, and different forms of finance place different requirements on it. Traditional finance leans toward offline risk control, auditing information collected on site such as customer data, credit inquiries, central bank credit reports, and educational background. Traditional offline risk control is slow and its review cycle is long, but it identifies and controls risk relatively well. Internet finance, which takes place online, is characterized by fast and convenient operation, so both users and enterprises pursue a faster pace. The traditional offline audit is clearly too slow to meet the needs of internet finance. In the prior art, some internet finance companies have moved the offline audit online, but in essence the offline audit pattern remains. The only difference is that users submit their data over the internet; the enterprise collects the data, audits it manually in batches, and returns the audit result over the internet. Although this pattern is faster than a fully offline audit, it still has obvious defects:
First, the user cannot obtain the audit result online in real time (that is, within a very short period) after submitting data, but must still wait for the result of a manual audit carried out in batches. This does not actually meet the pace that internet finance requires.
Second, the content audited in this mode is essentially the same as in the traditional offline audit; only the audit time is compressed. The audited content does not change much even though the audit time is shorter, so the ability to identify potential risks may decline, creating hidden risks for the enterprise.
Summary of the invention
The present invention proposes a big-data-based technique that can score a user's credit in real time.
According to an embodiment of the present invention, a real-time scoring method for user credit is proposed, comprising:
Data acquisition step: obtain the user's basic data over the internet;
Data processing step: import the acquired basic data into a data-stream computing cluster for real-time data processing;
Scoring step: feed the processed basic data into one or more scoring models for scoring, where the scoring models are built from existing data;
Data storage and feedback step: save the basic data, the processed basic data, and the score to a database, and obtain feedback information;
Model evaluation and optimization step: evaluate and optimize the scoring models and the data-stream computing cluster according to the saved basic data, processed basic data, score, and feedback information;
Update step: update the scoring models and data-stream computing cluster in use according to the optimized scoring models and data-stream computing cluster.
In one embodiment, the data acquisition step includes: obtaining the user's identity information; and obtaining the user's basic information, where the basic information is obtained over the internet from one or more third parties according to the user's identity information.
In one embodiment, the data processing step includes: importing the user's basic data into a data-stream computing framework, the framework being the Spark data-stream computing framework; classifying the user's basic data according to a data classification model, the data classification model corresponding to calculation dimensions; computing each calculation dimension in real time in the data-stream computing framework using the basic data of the corresponding class; and saving the calculation results and supplying them to each scoring model.
In one embodiment, a user profile is obtained from the user's basic data. Basic data with different attributes in the user profile corresponds to different calculation dimensions. The user profiles of a number of users are computed along the same calculation dimension to obtain customer segment data for that dimension.
In one embodiment, the feedback information includes the user's subsequent actual behavior.
In one embodiment, the model evaluation and optimization step includes two levels. User-level evaluation and optimization: the scoring models and the data-stream computing cluster are evaluated and optimized according to the basic data, processed basic data, score, and feedback information of a single user. Segment-level evaluation and optimization: by calculation dimension, the scoring models and the data-stream computing cluster are evaluated and optimized according to the basic data, processed basic data, scores, and feedback information of the users in one customer segment.
In one embodiment, each time the scoring models and the data-stream computing cluster are optimized, the scoring models and data-stream computing cluster currently in use are updated in real time.
In one embodiment, a scoring model is built from existing data using logistic regression, random forest, GBDT, or XGBoost.
In one embodiment, the database includes the unstructured database HBase and the relational database MySQL, and the data transmission middleware Kafka is used to read from and write to the database.
According to an embodiment of the present invention, a real-time scoring system for user credit is proposed, comprising: a data access interface, a data-stream computing cluster, one or more scoring models, a database, and a model evaluation and optimization device. The data access interface obtains the user's basic data over the internet. The acquired basic data is imported into the data-stream computing cluster, which performs real-time data processing. The processed basic data is fed into the scoring models for scoring, where the scoring models are built from existing data. The basic data, the processed basic data, and the score are saved to the database. The model evaluation and optimization device obtains feedback information, evaluates and optimizes the scoring models and the data-stream computing cluster according to the saved basic data, processed basic data, score, and feedback information, and updates the scoring models and data-stream computing cluster according to the optimized versions.
In one embodiment, the data access interface includes a data acquisition device, which obtains the user's identity information and, according to that identity information, obtains the user's basic information over the internet from one or more third parties.
In one embodiment, the data-stream computing framework is the Spark data-stream computing framework. The framework classifies the user's basic data according to a data classification model, which corresponds to calculation dimensions; computes each calculation dimension in real time using the basic data of the corresponding class; and saves the calculation results and supplies them to each scoring model.
In one embodiment, the feedback information includes the user's subsequent actual behavior. The model evaluation and optimization device also obtains a user profile from the user's basic data. Basic data with different attributes in the user profile corresponds to different calculation dimensions. The user profiles of a number of users are computed along the same calculation dimension to obtain customer segment data for that dimension.
In one embodiment, the model evaluation and optimization performed by the model evaluation and optimization device includes two levels. User-level evaluation and optimization: the scoring models and the data-stream computing cluster are evaluated and optimized according to the basic data, processed basic data, score, and feedback information of a single user. Segment-level evaluation and optimization: by calculation dimension, the scoring models and the data-stream computing cluster are evaluated and optimized according to the basic data, processed basic data, scores, and feedback information of the users in one customer segment.
In one embodiment, each time the model evaluation and optimization device optimizes the scoring models and the data-stream computing cluster, the scoring models and data-stream computing cluster currently in use are updated in real time.
In one embodiment, a scoring model is built from existing data using logistic regression, random forest, GBDT, or XGBoost. The database includes the unstructured database HBase and the relational database MySQL, and the data transmission middleware Kafka is used to read from and write to the database.
The real-time scoring method and real-time scoring system for user credit proposed by the present invention obtain a user's basic information over the internet and use big data and data-stream technology to score the user in real time across multiple dimensions; the score is made available to downstream processing. The invention also uses the user's subsequent actual behavior as feedback to evaluate and optimize the modeling and the data flow, so that the scoring system keeps evolving on a machine self-learning principle. The invention can provide strong data-driven theoretical support for real-time credit granting and lending risk control of internet finance users.
Brief description of the drawings
The above and other features, properties, and advantages of the present invention will become more apparent from the following description taken together with the drawings and embodiments, in which the same reference numerals always denote the same features:
Fig. 1 shows the implementation process of the real-time scoring method for user credit according to an embodiment of the present invention.
Fig. 2 shows the structural block diagram of the real-time scoring system for user credit according to an embodiment of the present invention.
Detailed description
With the development of big data technology, more comprehensive information about a user can be obtained through big data, and the user's risk can be assessed from the user's actual behavior, which is more effective than the traditional audit of background information. At the same time, because of its fast data processing capability, big data technology can complete the computation and processing of massive data within a very short time and can meet the user's demand for "real-time" results. Online audit methods based on big data have therefore gradually emerged and become an important means of audit in the field of internet finance.
The present invention proposes a real-time scoring method for user credit. Fig. 1 shows the implementation process of the real-time scoring method for user credit according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
102, data acquisition step: obtain the user's basic data over the internet. In one embodiment, the data acquisition step includes a step of obtaining the user's identity information and a step of obtaining the user's basic information. In the step of obtaining the user's basic information, the basic information is obtained over the internet from one or more third parties according to the user's identity information. The identity information is usually provided by the user and typically includes the ID card number, name, ID card validity period, and ID card photo. In some embodiments, the user is also required to provide a mobile phone number as part of the identity information. With big data technology, means such as web crawlers can be used to obtain other information about the user on the internet according to the user's identity information; this information is called the user's basic information. Basic information can come from one or more third parties, for example telecom operators, banking systems, other internet finance systems, credit reporting systems, social software, online trading software, and online functional applications. Basic information may include: address book and call detail records, educational background, presence on a blacklist, online shopping behavior, air travel information, online borrowing behavior, social account relationship network, online community behavior, credit reports, repayment status, and so on.
104, data processing step: the acquired basic data is imported into a data-stream computing cluster for real-time data processing. In one embodiment, the data processing step includes the following:
The user's basic data is imported into a data-stream computing framework, which is the Spark data-stream computing framework. Spark is a distributed in-memory computing framework; it computes quickly and can process data continuously in real time in the form of a data stream. In one embodiment, the data transmission middleware Kafka is used to transport data. For example, data is imported into the Spark data-stream computing framework through Kafka, and the results computed by Spark are supplied to the scoring models or saved to the database through Kafka. Kafka is a high-throughput distributed publish-subscribe messaging system. Messages are shared through subscription, which tells the relevant systems which data to receive. Kafka is suited to data transmission with large data volumes and low latency requirements.
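The transport pattern described above (input topic, stream processor, output topic) can be sketched as follows. This is only an in-process illustration that uses Python standard-library queues as stand-ins for topics; it does not use the Kafka or Spark APIs, and the names `raw_topic`, `result_topic`, and `process` are hypothetical.

```python
import queue
import threading

# In-process stand-ins for two message topics (assumption: not real Kafka topics).
raw_topic = queue.Queue()     # carries acquired basic data toward the stream processor
result_topic = queue.Queue()  # carries processed results toward scoring and storage
STOP = object()               # sentinel that ends the stream in this sketch

def process(message):
    # Hypothetical per-message processing: count the fields that arrived.
    return {"user_id": message["user_id"], "field_count": len(message["basic_data"])}

def stream_worker():
    # Consume messages one by one as they arrive, as a stream processor would.
    while True:
        message = raw_topic.get()
        if message is STOP:
            result_topic.put(STOP)
            break
        result_topic.put(process(message))

worker = threading.Thread(target=stream_worker)
worker.start()
for user_id in ("u1", "u2"):
    raw_topic.put({"user_id": user_id, "basic_data": {"age": 21, "city": "Shanghai"}})
raw_topic.put(STOP)
worker.join()

results = []
while True:
    item = result_topic.get()
    if item is STOP:
        break
    results.append(item)
```

Because each message is handled as soon as it is consumed, a user's data does not wait for a batch to fill, which is the property the text attributes to data-stream processing.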
The user's basic data is classified according to a data classification model, which corresponds to calculation dimensions. The data classification model maps the user's basic data to the calculation dimensions that the scoring models require. A user profile is obtained from the user's basic data, and basic data with different attributes in the user profile corresponds to different calculation dimensions. For example, the user's basic data may include: gender, age, city, residential address, industry, company name, position, education level, educational background, credit report, online shopping behavior statistics, online borrowing behavior statistics, repayment status, air travel data, address book and call detail records, social media account, social media relationship network, social media activity, and so on. Among these:
Gender and age can be assigned to the basic attribute (corresponding to the age or gender calculation dimension);
City and residential address can be assigned to the regional attribute (corresponding to the region calculation dimension);
Industry, company name, and position can be assigned to the work attribute (corresponding to the work calculation dimension);
Education level and educational background can be assigned to the education attribute (corresponding to the education calculation dimension);
Credit report, online borrowing behavior statistics, and repayment status can be assigned to the credit attribute (corresponding to the credit calculation dimension);
Online shopping behavior statistics and air travel data can be assigned to the behavior attribute (corresponding to the behavior calculation dimension);
Address book, call detail records, social media account, social media relationship network, and social media activity can be assigned to the social relationship attribute (corresponding to the social network calculation dimension).
The data-stream computing framework computes each calculation dimension in real time using the basic data of the corresponding class. As described above, when a given calculation dimension needs to be computed, the basic data of the corresponding attribute is selected for the computation.
The calculation results are saved and supplied to each scoring model.
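The mapping from basic data to calculation dimensions can be illustrated with a small lookup table. The sketch below is an assumption about how such a data classification model might be represented; the field names and dimension labels are hypothetical and simply follow the attribute groups listed above.

```python
from collections import defaultdict

# Hypothetical field-to-dimension table that follows the attribute groups in the text.
FIELD_TO_DIMENSION = {
    "gender": "basic", "age": "basic",
    "city": "region", "residential_address": "region",
    "industry": "work", "company_name": "work", "position": "work",
    "education_level": "education", "educational_background": "education",
    "credit_report": "credit", "online_borrowing_stats": "credit",
    "repayment_status": "credit",
    "online_shopping_stats": "behavior", "air_travel_data": "behavior",
    "address_book": "social", "call_detail_records": "social",
    "social_media_account": "social", "social_media_network": "social",
    "social_media_activity": "social",
}

def classify_basic_data(basic_data):
    """Group a user's raw fields by calculation dimension.

    Fields that the table does not know are kept under 'unclassified'.
    """
    grouped = defaultdict(dict)
    for name, value in basic_data.items():
        grouped[FIELD_TO_DIMENSION.get(name, "unclassified")][name] = value
    return dict(grouped)

user = {"gender": "M", "age": 21, "city": "Shanghai",
        "education_level": "bachelor", "nickname": "x"}
profile = classify_basic_data(user)
```

When a particular dimension is computed, only the fields grouped under that dimension (for example `profile["basic"]`) would need to be read.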
106, scoring step: the processed basic data is fed into one or more scoring models for scoring, where the scoring models are built from existing data. In one embodiment, a scoring model is obtained by modeling the existing data in the database with logistic regression, random forest, GBDT, or XGBoost. What each scoring model scores, and how, is determined by policy. The specific policies, the types of scoring model, and the modeling process are outside the scope of the present invention, which directly uses scoring models that have already been built.
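As an illustration of applying an already-built model, the sketch below scores one user with a logistic regression. The coefficients, feature names, and the 300 to 850 score range are all hypothetical assumptions made for this example; the text does not specify them, and a real model would be fitted on historical data.

```python
import math

# Hypothetical coefficients; a real model would be fitted on existing historical data.
WEIGHTS = {
    "on_time_repayment_ratio": 2.0,
    "open_online_loans": -0.5,
    "years_employed": 0.3,
}
INTERCEPT = -1.0

def repayment_probability(features):
    """Logistic regression: sigmoid of the weighted feature sum. Missing features count as 0."""
    z = INTERCEPT + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

def credit_score(features, low=300, high=850):
    """Map the probability linearly onto an assumed score range."""
    return round(low + (high - low) * repayment_probability(features))

good = {"on_time_repayment_ratio": 1.0, "open_online_loans": 0, "years_employed": 3}
risky = {"on_time_repayment_ratio": 0.2, "open_online_loans": 4, "years_employed": 0}
```

A random forest, GBDT, or XGBoost model would replace `repayment_probability` while the surrounding pipeline stays the same.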
108, data storage and feedback step: the basic data, the processed basic data, and the score are saved to a database, and feedback information is obtained. In one embodiment, the database used by the invention includes the unstructured database HBase and the relational database MySQL, and the data transmission middleware Kafka is used to read from and write to the database. HBase is a distributed, column-oriented unstructured database with fast queries; saving the basic data, processed basic data, and scores in HBase satisfies the requirement for real-time queries. MySQL is a relational database used to store some structured configuration information. Feedback information mainly refers to the user's actual behavior. The score produced by the user credit scoring method is "estimated data" derived from the user's basic data and existing historical data. To verify whether the "estimated data" is correct, subsequent real data must be obtained for verification. For example, in user credit scoring, the "estimated data" represents an assessment of the user's willingness and ability to repay, but whether the user actually repays must be judged from the user's actual behavior. Therefore, in one embodiment, the feedback information includes the user's subsequent actual behavior.
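The relationship between a stored score ("estimated data") and later feedback can be sketched as below. The in-memory dictionary is only a stand-in for the database (it does not use the HBase or MySQL APIs), and the record fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScoreRecord:
    user_id: str
    basic_data: dict
    processed_data: dict
    estimated_repay_prob: float             # the "estimated data" produced at scoring time
    actually_repaid: Optional[bool] = None  # feedback, unknown until the user acts

class InMemoryStore:
    """Stand-in for the database; keeps one record per user."""

    def __init__(self):
        self._rows = {}

    def save(self, record):
        self._rows[record.user_id] = record

    def add_feedback(self, user_id, repaid):
        self._rows[user_id].actually_repaid = repaid

    def labelled(self):
        # Only records whose feedback has arrived can be used for evaluation.
        return [r for r in self._rows.values() if r.actually_repaid is not None]

def prediction_error(record):
    """Absolute gap between the estimate and the observed outcome (1 = repaid, 0 = not)."""
    return abs(record.estimated_repay_prob - (1.0 if record.actually_repaid else 0.0))

store = InMemoryStore()
store.save(ScoreRecord("u1", {"age": 21}, {"basic": {"age": 21}}, 0.8))
store.save(ScoreRecord("u2", {"age": 35}, {"basic": {"age": 35}}, 0.6))
store.add_feedback("u1", True)
```

Here `u2` has been scored but has no feedback yet, so only `u1` is available for verifying the estimate.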
110, model evaluation and optimization step: the scoring models and the data-stream computing cluster are evaluated and optimized according to the saved basic data, processed basic data, score, and feedback information. In one embodiment, the evaluation and optimization of the scoring models and the data-stream computing cluster includes two levels: user-level evaluation and optimization, and segment-level evaluation and optimization. User-level evaluation and optimization evaluates and optimizes the scoring models and the data-stream computing cluster according to the basic data, processed basic data, score, and feedback information of a single user. Segment-level evaluation and optimization works by calculation dimension and evaluates and optimizes the scoring models and the data-stream computing cluster according to the basic data, processed basic data, scores, and feedback information of the users in one customer segment. Customer segments are obtained as follows: a user profile is obtained from the user's basic data; basic data with different attributes in the user profile corresponds to different calculation dimensions; and the user profiles of a number of users are computed along the same calculation dimension to obtain the customer segment data for that dimension. Returning to the earlier example, the user's basic data may include: gender, age, city, residential address, industry, company name, position, education level, educational background, credit report, online shopping behavior statistics, online borrowing behavior statistics, repayment status, air travel data, address book and call detail records, social media account, social media relationship network, social media activity, and so on.
Gender and age can be assigned to the basic attribute (corresponding to the age or gender calculation dimension);
City and residential address can be assigned to the regional attribute (corresponding to the region calculation dimension);
Industry, company name, and position can be assigned to the work attribute (corresponding to the work calculation dimension);
Education level and educational background can be assigned to the education attribute (corresponding to the education calculation dimension);
Credit report, online borrowing behavior statistics, and repayment status can be assigned to the credit attribute (corresponding to the credit calculation dimension);
Online shopping behavior statistics and air travel data can be assigned to the behavior attribute (corresponding to the behavior calculation dimension);
Address book, call detail records, social media account, social media relationship network, and social media activity can be assigned to the social relationship attribute (corresponding to the social network calculation dimension).
According to the age calculation dimension, users whose age falls within a certain range can be selected, for example a young customer group aged 20 to 22. Alternatively, combining age with gender selects a young male customer group aged 20 to 22.
As another example, combining the education calculation dimension with the age calculation dimension selects a well-educated young customer group aged 20 to 22 with a bachelor's degree or above.
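The segment examples above amount to combining per-dimension conditions. A minimal sketch, assuming each user profile is a flat dictionary with hypothetical field names and education labels:

```python
def in_segment(profile, **conditions):
    """Return True if the profile has every named field and each field passes its test."""
    return all(name in profile and test(profile[name])
               for name, test in conditions.items())

def is_young(age):
    return 20 <= age <= 22

# Assumed labels for "bachelor's degree or above".
BACHELOR_OR_ABOVE = {"bachelor", "master", "doctorate"}

users = [
    {"id": "u1", "age": 21, "gender": "M", "education": "bachelor"},
    {"id": "u2", "age": 22, "gender": "F", "education": "high_school"},
    {"id": "u3", "age": 35, "gender": "M", "education": "master"},
]

# Age dimension only.
young = [u["id"] for u in users if in_segment(u, age=is_young)]
# Age combined with gender.
young_men = [u["id"] for u in users
             if in_segment(u, age=is_young, gender=lambda g: g == "M")]
# Education combined with age.
young_educated = [u["id"] for u in users
                  if in_segment(u, age=is_young,
                                education=lambda e: e in BACHELOR_OR_ABOVE)]
```

Adding another keyword condition narrows the segment further, which mirrors how combining calculation dimensions yields finer customer segments.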
By combining different calculation dimensions, customer segments with different attributes can be obtained. By comparing the feedback data of each customer segment with the "estimated data" obtained from earlier scoring, the scoring models and the data-stream computing cluster (mainly the data classification model within it) can be evaluated. This evaluation effectively reveals where the scoring models and the data-stream computing cluster fit a particular customer segment poorly. For example, if for some customer segment the "estimated data" differs substantially from the actual feedback data, the scoring models and data-stream computing cluster have a blind spot for that segment and are not adapted to its characteristics. The scoring models and the data-stream computing cluster (mainly the data classification model within it) are then optimized according to the actual feedback, focusing on the points where the "estimated data" differs significantly from the actual feedback data. The optimization can combine model optimization with policy, analyzing the best way to apply scoring models and policies together so that the scoring models remain effective and stable in every customer segment. Specific schemes for optimizing scoring models are outside the scope of the present invention.
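One simple way to look for such blind spots is to compare, per segment, the mean estimated repayment probability with the observed repayment rate. The sketch below is an assumption about how this comparison could be made; the gap metric and the 0.2 threshold are hypothetical and are not taken from the text.

```python
from collections import defaultdict

def segment_gaps(records, segment_of):
    """Per segment: |mean estimated repayment probability - observed repayment rate|."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # [sum of estimates, repaid count, n]
    for record in records:
        bucket = totals[segment_of(record)]
        bucket[0] += record["estimated"]
        bucket[1] += 1.0 if record["repaid"] else 0.0
        bucket[2] += 1
    return {segment: abs(est / n - repaid / n)
            for segment, (est, repaid, n) in totals.items()}

def blind_spots(gaps, threshold=0.2):
    """Segments whose gap exceeds the (hypothetical) threshold."""
    return sorted(segment for segment, gap in gaps.items() if gap > threshold)

records = [
    {"age": 21, "estimated": 0.9, "repaid": False},
    {"age": 22, "estimated": 0.9, "repaid": True},
    {"age": 35, "estimated": 0.9, "repaid": True},
    {"age": 40, "estimated": 0.9, "repaid": True},
]

def age_band(record):
    return "20-22" if 20 <= record["age"] <= 22 else "other"

gaps = segment_gaps(records, age_band)
flagged = blind_spots(gaps)
```

In this toy data the model is equally confident about both segments, but only the 20 to 22 segment repays far less often than estimated, so it is the one flagged for optimization.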
112, update step: the scoring models and data-stream computing cluster in use are updated according to the optimized scoring models and data-stream computing cluster. In one embodiment, each time the scoring models and data-stream computing cluster are optimized, the scoring models and data-stream computing cluster currently in use are updated in real time. Because the invention uses data-stream processing, each user's data is processed in real time as it is received. Therefore, as soon as a user-level or segment-level evaluation and optimization has been completed and the scoring models and data-stream computing cluster have been optimized, the scoring models and data-stream computing cluster currently in use are updated immediately, so that the next user's data is processed with the optimized scoring models and data-stream computing cluster.
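The "update immediately, so the next user sees the new model" behavior can be sketched with a small registry that swaps the active model under a lock. This is an illustrative assumption about one possible mechanism; the class and function names are hypothetical.

```python
import threading

class ModelRegistry:
    """Holds the scoring model currently in use and lets it be replaced atomically."""

    def __init__(self, model):
        self._lock = threading.Lock()
        self._model = model
        self._version = 1

    def current(self):
        # Each scoring request reads the model that is active at that moment.
        with self._lock:
            return self._version, self._model

    def publish(self, model):
        # Called once an optimized model is ready; later requests pick it up.
        with self._lock:
            self._model = model
            self._version += 1
            return self._version

registry = ModelRegistry(lambda features: 0.5)

def score_next_user(features):
    version, model = registry.current()
    return version, model(features)

before = score_next_user({"age": 21})
registry.publish(lambda features: 0.7)  # optimized model goes live
after = score_next_user({"age": 21})
```

A request that already fetched the old model finishes with it, while every request that starts after `publish` uses the optimized model.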
It should be noted that the real-time scoring method for user credit of the present invention uses real-time data-stream processing. From the perspective of a single user, the data acquisition step, data processing step, scoring step, data storage and feedback step, model evaluation and optimization step, and update step are executed in sequence. From the perspective of the method as a whole, many users are processed at the same moment and each is at a different stage, so these steps may be executed alternately or simultaneously. Therefore, although each step is numbered in the description above, the numbering is for convenience only and does not limit the order in which the steps are executed.
The present invention also discloses a real-time scoring system for user credit. Fig. 2 shows the structural block diagram of the real-time scoring system for user credit according to an embodiment of the present invention. The real-time scoring system for user credit includes: a data access interface 202, a data-stream computing cluster 204, one or more scoring models 206, a database 208, and a model evaluation and optimization device 210.
The data access interface 202 obtains the user's basic data over the internet. In one embodiment, the data access interface 202 includes a data acquisition device, which obtains the user's identity information and, according to that identity information, obtains the user's basic information over the internet from one or more third parties. The data access interface 202 executes the aforementioned data acquisition step 102; the details are not repeated here.
The acquired basic data is imported into the data-flow computation cluster 204, which processes the data in real time. In one embodiment, the data-flow computation framework 204 is the Spark data-flow computation framework. The data-flow computation framework 204 classifies the basic data of the user according to a data classification model, where the data classification model corresponds to the calculation dimensions. The framework calculates each calculation dimension in real time using the basic data of the corresponding category, saves the calculation results, and provides them to each scoring model. The data-flow computation cluster 204 performs the aforementioned data processing step 104; the details are not repeated here.
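The classify-then-calculate flow of the data-flow computation cluster can be sketched in plain Python. This is a minimal sketch under stated assumptions: the field names, the dimension names (`identity`, `repayment`, `consumption`), and the use of a mean as the per-dimension calculation are all hypothetical, and in the embodiment this logic would run inside the Spark data-flow computation framework rather than in a single process.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data classification model: basic-data field -> calculation dimension.
CLASSIFICATION = {
    "age": "identity",
    "years_at_address": "identity",
    "on_time_ratio": "repayment",
    "overdue_count": "repayment",
    "monthly_spend": "consumption",
}


def classify(records):
    """Group (field, value) records by calculation dimension; unknown fields are dropped."""
    by_dim = defaultdict(dict)
    for field, value in records:
        dim = CLASSIFICATION.get(field)
        if dim is not None:
            by_dim[dim][field] = value
    return dict(by_dim)


def compute_dimensions(by_dim):
    """Compute one placeholder value per dimension (here simply the mean of its fields)."""
    return {dim: mean(fields.values()) for dim, fields in by_dim.items()}


records = [
    ("age", 30),
    ("years_at_address", 4),
    ("on_time_ratio", 0.9),
    ("overdue_count", 1),
    ("unmapped_field", 7),
]
by_dim = classify(records)
features = compute_dimensions(by_dim)
```

The resulting `features` dictionary plays the role of the calculation results that are saved and handed to each scoring model.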
The processed basic data is imported into the one or several scoring models 206 for scoring, where each scoring model is built from existing data. In one embodiment, the scoring model 206 is built from existing data using logistic regression, random forest, GBDT, or XGBoost. The scoring model 206 performs the aforementioned scoring step 106; the details are not repeated here.
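Of the modeling techniques listed, logistic regression is the simplest to illustrate. The sketch below assumes the coefficients have already been fitted offline on existing data; the coefficient values, the dimension names, and the 300 to 850 score range are hypothetical and chosen only for the example.

```python
import math

# Hypothetical coefficients, as if fitted offline on existing data.
WEIGHTS = {"identity": 0.02, "repayment": 2.5, "consumption": -0.0001}
BIAS = -1.0


def repayment_probability(features):
    """Logistic regression: sigmoid(bias + sum of weight * feature value)."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def credit_score(features, low=300, high=850):
    """Map the predicted probability linearly onto an assumed score range."""
    return round(low + (high - low) * repayment_probability(features))


weak = credit_score({"identity": 17, "repayment": 0.2})
strong = credit_score({"identity": 17, "repayment": 0.95})
```

With these assumed weights, a higher value on the `repayment` dimension yields a higher score, which is the behavior one would expect a fitted model to show.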
The database 208 stores the basic data, the processed basic data, and the scores. In one embodiment, the database includes the unstructured database HBase and the relational database MySQL, and the data transmission middleware Kafka is used to read from and write to the database. The database 208 performs the data storage part of the aforementioned data storage and feedback step 108; the details are not repeated here.
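The storage path (middleware in front of an unstructured store and a relational store) can be mimicked with standard-library stand-ins. In this sketch `queue.Queue` stands in for Kafka, a dictionary for HBase, and an in-memory SQLite database for MySQL. The split of raw documents into the unstructured store and scores into the relational table is an assumption; the text does not say which data goes where.

```python
import json
import queue
import sqlite3

message_bus = queue.Queue()         # stand-in for the Kafka middleware
raw_store = {}                      # stand-in for HBase: row key -> document
sql = sqlite3.connect(":memory:")   # stand-in for MySQL
sql.execute("CREATE TABLE scores (user_id TEXT PRIMARY KEY, score INTEGER)")


def publish(user_id, basic, processed, score):
    """Producer side: serialize one scoring result onto the message bus."""
    message_bus.put(json.dumps(
        {"user_id": user_id, "basic": basic, "processed": processed, "score": score}
    ))


def drain():
    """Consumer side: write every queued message to both stores."""
    while not message_bus.empty():
        msg = json.loads(message_bus.get())
        raw_store[msg["user_id"]] = {
            "basic": msg["basic"],
            "processed": msg["processed"],
        }
        sql.execute(
            "INSERT OR REPLACE INTO scores VALUES (?, ?)",
            (msg["user_id"], msg["score"]),
        )
    sql.commit()


publish("u1", {"age": 30}, {"identity": 17}, 640)
drain()
stored_score = sql.execute(
    "SELECT score FROM scores WHERE user_id = ?", ("u1",)
).fetchone()[0]
```

Decoupling writers from the stores through a queue is the point being illustrated; the real middleware and databases would add partitioning, durability, and concurrency that this sketch omits.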
The model evaluation and optimization device 210 obtains feedback information; evaluates and optimizes the scoring models and the data-flow computation cluster according to the saved basic data, processed basic data, scores, and feedback information; and updates the scoring models and the data-flow computation cluster according to the optimized scoring models and data-flow computation cluster. In one embodiment, the feedback information includes the user's subsequent actual operation behavior. The model evaluation and optimization device 210 obtains a user profile from the basic data of the user, where basic data of different attributes in the user profile corresponds to different calculation dimensions. The user profiles of several users are calculated along the same calculation dimension to obtain subdivided user group data for that calculation dimension. After the subdivided user groups are obtained, the model evaluation and optimization performed by the device 210 comprises two levels: user-level evaluation and optimization, and user-group-level evaluation and optimization. User-level evaluation and optimization evaluates and optimizes the scoring models and the data-flow computation cluster according to the basic data, processed basic data, score, and feedback information of a single user. User-group-level evaluation and optimization does so, for a given calculation dimension, according to the basic data, processed basic data, scores, and feedback information of the users in one subdivided user group. In one embodiment, each time the model evaluation and optimization device 210 optimizes the scoring models and the data-flow computation cluster, the scoring models and data-flow computation cluster currently in use are updated in real time. The model evaluation and optimization device 210 performs the feedback part of the aforementioned data storage and feedback step 108, the model evaluation and optimization step 110, and the update step 112; the details are not repeated here.
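The two evaluation levels can be illustrated with a simple error measure. The sketch below is only one possible reading: it assumes the feedback is a boolean "repaid" outcome, uses squared error (the Brier score) as the evaluation metric, and flags a subdivided user group for optimization when its mean error exceeds a hypothetical threshold of 0.25. None of these choices is specified in the text.

```python
from collections import defaultdict


def user_level_error(predicted_probability, repaid):
    """Squared error between the predicted repayment probability and the observed outcome."""
    return (predicted_probability - (1.0 if repaid else 0.0)) ** 2


def group_level_error(samples):
    """Mean squared error per subdivided user group.

    samples: iterable of (group, predicted_probability, repaid) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for group, prob, repaid in samples:
        totals[group][0] += user_level_error(prob, repaid)
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


def groups_needing_optimization(samples, threshold=0.25):
    """Return the groups whose mean error exceeds the assumed threshold."""
    errors = group_level_error(samples)
    return sorted(group for group, err in errors.items() if err > threshold)


samples = [
    ("under_30", 0.9, True),
    ("under_30", 0.8, False),
    ("over_60", 0.7, True),
]
errors = group_level_error(samples)
flagged = groups_needing_optimization(samples)
```

Here the `under_30` group is flagged because one confident prediction turned out wrong, while the `over_60` group stays below the threshold. A flagged group would then trigger re-fitting of the scoring model for that calculation dimension.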
The real-time user credit scoring method and system proposed by the present invention obtain the basic information of a user over the internet and use big data and data-flow technologies to score the user in real time across multiple dimensions; the score is provided for use in subsequent processing. The invention also takes the user's subsequent actual operations as feedback to evaluate and optimize the modeling and the data flow, so that the scoring system evolves continuously on the principle of machine self-learning. The invention can provide strong data and theoretical support for real-time user credit extension and lending risk control in internet finance.
The above embodiments are provided so that those skilled in the art can implement or use the present invention. Those skilled in the art may make various modifications or changes to the above embodiments without departing from the inventive concept of the present invention. The scope of protection of the invention is therefore not limited by the above embodiments, but should be the broadest scope consistent with the inventive features recited in the claims.
Claims (16)
1. A real-time user credit scoring method, characterized by comprising:
a data acquisition step of obtaining basic data of a user over the internet;
a data processing step of importing the acquired basic data into a data-flow computation cluster for real-time data processing;
a scoring step of importing the processed basic data into one or several scoring models for scoring, wherein the scoring models are built from existing data;
a data storage and feedback step of saving the basic data, the processed basic data, and the scores to a database, and obtaining feedback information;
a model evaluation and optimization step of evaluating and optimizing the scoring models and the data-flow computation cluster according to the saved basic data, processed basic data, scores, and feedback information; and
an update step of updating the scoring models and data-flow computation cluster in use according to the optimized scoring models and data-flow computation cluster.
2. The real-time user credit scoring method of claim 1, characterized in that the data acquisition step comprises:
obtaining identity information of the user; and
obtaining basic information of the user from one or several third parties over the internet according to the identity information of the user.
3. The real-time user credit scoring method of claim 1, characterized in that the data processing step comprises:
importing the basic data of the user into a data-flow computation framework, the data-flow computation framework being the Spark data-flow computation framework;
classifying the basic data of the user according to a data classification model, the data classification model corresponding to calculation dimensions;
calculating, by the data-flow computation framework, each calculation dimension in real time using the basic data of the corresponding category; and
saving the calculation results and providing them to each scoring model.
4. The real-time user credit scoring method of claim 3, characterized in that a user profile is obtained from the basic data of the user, basic data of different attributes in the user profile corresponds to different calculation dimensions, and the user profiles of several users are calculated along the same calculation dimension to obtain subdivided user group data for that calculation dimension.
5. The real-time user credit scoring method of claim 4, characterized in that the feedback information includes the user's subsequent actual operation behavior.
6. The real-time user credit scoring method of claim 5, characterized in that the model evaluation and optimization step comprises:
user-level evaluation and optimization: evaluating and optimizing the scoring models and the data-flow computation cluster according to the basic data, processed basic data, score, and feedback information of a single user; and
user-group-level evaluation and optimization: for a given calculation dimension, evaluating and optimizing the scoring models and the data-flow computation cluster according to the basic data, processed basic data, scores, and feedback information of the users in one subdivided user group.
7. The real-time user credit scoring method of claim 6, characterized in that each time the scoring models and the data-flow computation cluster are optimized, the scoring models and data-flow computation cluster currently in use are updated in real time.
8. The real-time user credit scoring method of claim 1, characterized in that the scoring models are built from existing data using logistic regression, random forest, GBDT, or XGBoost.
9. The real-time user credit scoring method of claim 1, characterized in that the database includes the unstructured database HBase and the relational database MySQL, and the data transmission middleware Kafka is used to read from and write to the database.
10. A real-time user credit scoring system, characterized by comprising:
a data access port that obtains basic data of a user over the internet;
a data-flow computation cluster into which the acquired basic data is imported and which processes the data in real time;
one or several scoring models into which the processed basic data is imported for scoring, wherein the scoring models are built from existing data;
a database in which the basic data, the processed basic data, and the scores are saved; and
a model evaluation and optimization device that obtains feedback information, evaluates and optimizes the scoring models and the data-flow computation cluster according to the saved basic data, processed basic data, scores, and feedback information, and updates the scoring models and the data-flow computation cluster according to the optimized scoring models and data-flow computation cluster.
11. The real-time user credit scoring system of claim 10, characterized in that the data access port includes a data acquisition device, which obtains identity information of the user and, according to the identity information of the user, obtains basic information of the user from one or several third parties over the internet.
12. The real-time user credit scoring system of claim 10, characterized in that the data-flow computation framework is the Spark data-flow computation framework;
the data-flow computation framework classifies the basic data of the user according to a data classification model, the data classification model corresponding to calculation dimensions; and the data-flow computation framework calculates each calculation dimension in real time using the basic data of the corresponding category, saves the calculation results, and provides them to each scoring model.
13. The real-time user credit scoring system of claim 12, characterized in that
the feedback information includes the user's subsequent actual operation behavior; and
the model evaluation and optimization device further obtains a user profile from the basic data of the user, basic data of different attributes in the user profile corresponds to different calculation dimensions, and the user profiles of several users are calculated along the same calculation dimension to obtain subdivided user group data for that calculation dimension.
14. The real-time user credit scoring system of claim 13, characterized in that the model evaluation and optimization performed by the model evaluation and optimization device comprises:
user-level evaluation and optimization: evaluating and optimizing the scoring models and the data-flow computation cluster according to the basic data, processed basic data, score, and feedback information of a single user; and
user-group-level evaluation and optimization: for a given calculation dimension, evaluating and optimizing the scoring models and the data-flow computation cluster according to the basic data, processed basic data, scores, and feedback information of the users in one subdivided user group.
15. The real-time user credit scoring system of claim 10, characterized in that each time the model evaluation and optimization device optimizes the scoring models and the data-flow computation cluster, the scoring models and data-flow computation cluster currently in use are updated in real time.
16. The real-time user credit scoring system of claim 10, characterized in that
the scoring models are built from existing data using logistic regression, random forest, GBDT, or XGBoost; and
the database includes the unstructured database HBase and the relational database MySQL, and the data transmission middleware Kafka is used to read from and write to the database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711444140.8A CN108446291A (en) | 2017-12-27 | 2017-12-27 | Real-time user credit scoring method and scoring system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711444140.8A CN108446291A (en) | 2017-12-27 | 2017-12-27 | Real-time user credit scoring method and scoring system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108446291A (en) | 2018-08-24 |
Family
ID=63190740
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711444140.8A Pending CN108446291A (en) | Real-time user credit scoring method and scoring system | 2017-12-27 | 2017-12-27 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108446291A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109360085A (en) * | 2018-09-27 | 2019-02-19 | 中国银行股份有限公司 | A kind of bank client responsible investigation method and system |
CN109472439A (en) * | 2018-09-13 | 2019-03-15 | 深圳市买买提信息科技有限公司 | Credit estimation method, device, equipment and system |
CN109815257A (en) * | 2019-01-16 | 2019-05-28 | 四川驹马科技有限公司 | Scalable real-time High Availability portrait algorithm service method and its system |
CN110399988A (en) * | 2019-07-31 | 2019-11-01 | 中国工商银行股份有限公司 | Equipment portrait generation method and system |
CN112084486A (en) * | 2020-09-08 | 2020-12-15 | 中国平安财产保险股份有限公司 | User information verification method and device, electronic equipment and storage medium |
CN112258314A (en) * | 2020-10-19 | 2021-01-22 | 天元大数据信用管理有限公司 | Financial wind-control credit investigation system and method based on flow calculation technology |
CN112347343A (en) * | 2020-09-25 | 2021-02-09 | 北京淇瑀信息科技有限公司 | Customized information pushing method and device and electronic equipment |
CN112446555A (en) * | 2021-01-26 | 2021-03-05 | 支付宝(杭州)信息技术有限公司 | Risk identification method, device and equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1680953A (en) * | 2004-07-05 | 2005-10-12 | 中国银行股份有限公司 | Risk analyzing system and method for customer of financial enterprise |
CN101493913A (en) * | 2008-01-23 | 2009-07-29 | 阿里巴巴集团控股有限公司 | Method and system for assessing user credit in internet |
CN105894336A (en) * | 2016-05-25 | 2016-08-24 | 北京比邻弘科科技有限公司 | Mobile Internet-based big data mining method and system |
CN107194715A (en) * | 2017-04-07 | 2017-09-22 | 广东精点数据科技股份有限公司 | The construction method of social action data model |
CN107330785A (en) * | 2017-07-10 | 2017-11-07 | 广州市触通软件科技股份有限公司 | A kind of petty loan system and method based on the intelligent air control of big data |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109472439A (en) * | 2018-09-13 | 2019-03-15 | 深圳市买买提信息科技有限公司 | Credit estimation method, device, equipment and system |
CN109360085A (en) * | 2018-09-27 | 2019-02-19 | 中国银行股份有限公司 | A kind of bank client responsible investigation method and system |
CN109815257A (en) * | 2019-01-16 | 2019-05-28 | 四川驹马科技有限公司 | Scalable real-time High Availability portrait algorithm service method and its system |
CN110399988A (en) * | 2019-07-31 | 2019-11-01 | 中国工商银行股份有限公司 | Equipment portrait generation method and system |
CN112084486A (en) * | 2020-09-08 | 2020-12-15 | 中国平安财产保险股份有限公司 | User information verification method and device, electronic equipment and storage medium |
CN112347343A (en) * | 2020-09-25 | 2021-02-09 | 北京淇瑀信息科技有限公司 | Customized information pushing method and device and electronic equipment |
CN112347343B (en) * | 2020-09-25 | 2024-05-28 | 北京淇瑀信息科技有限公司 | Custom information pushing method and device and electronic equipment |
CN112258314A (en) * | 2020-10-19 | 2021-01-22 | 天元大数据信用管理有限公司 | Financial wind-control credit investigation system and method based on flow calculation technology |
CN112446555A (en) * | 2021-01-26 | 2021-03-05 | 支付宝(杭州)信息技术有限公司 | Risk identification method, device and equipment |
CN112446555B (en) * | 2021-01-26 | 2021-05-25 | 支付宝(杭州)信息技术有限公司 | Risk identification method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180824 |