CN110675241A - Label calibration system and method - Google Patents

Publication number
CN110675241A
Authority
CN
China
Prior art keywords
label
user
financial
model
establishing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910751423.XA
Other languages
Chinese (zh)
Inventor
林逸飞
黄向前
赵音龙
林三吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Xinyan Artificial Intelligence Technology Co Ltd
Original Assignee
Shanghai Xinyan Artificial Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Xinyan Artificial Intelligence Technology Co Ltd
Priority to CN201910751423.XA
Publication of CN110675241A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03: Credit; Loans; Processing thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Mathematical Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Algebra (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Technology Law (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a label calibration system and method. The system comprises: an authorization processing module for authorizing and processing a user's financial historical data; an analysis module for screening financial keywords from the authorized financial historical data and performing statistical analysis on the screened financial keywords; a basic label establishing module for establishing a plurality of dimensions and the financial keywords corresponding to each dimension based on the financial keywords, and for establishing a basic label system from the financial keywords in the plurality of dimensions by permutation and combination; a model label establishing module for establishing the model label to which the user belongs through a preset judgment rule and a preset judgment model; and a prediction label establishing module for predicting the user's prediction label by using a scoring card model established with deep learning technology together with the user's model label.

Description

Label calibration system and method
Technical Field
The invention relates to the technical field of big data mining, in particular to a label calibration system and a label calibration method.
Background
With the rapid development and popularization of the internet, information has grown explosively and accumulated online in large quantities. Internet users are no longer only browsers of content but also creators of it, so the forms of internet information have diversified, which makes information screening very difficult. Each large internet company has therefore established a query label system based on the characteristics of its own data, for example Douban Reading and NetEase Cloud Music, so that users can conveniently filter and query content.
Similarly, in the field of internet financial risk control, each internet financial company establishes labels for risk control based only on its own data. Because internet financial data is sensitive and confidential and internet users are highly mobile, the labels established are incomplete and clearly limited, and the resulting label system struggles to achieve the expected risk control effect.
The existing label systems of internet financial companies have the following defects:
Unsound foundation: because the basic data of the label system is limited, the established label system cannot describe the user comprehensively, which leads to erroneous risk control strategies.
Business limitations: because the basic data has a single business attribute, the established label system is limited and cannot be applied to other business scenarios.
Technical limitations: because the basic data is not rich enough, it is difficult to achieve a good training effect when models are built on the established label system with deep learning and artificial intelligence technology.
Disclosure of Invention
The invention provides a label calibration system and a label calibration method aiming at the problems and the defects in the prior art.
The invention solves the technical problems through the following technical scheme:
the invention provides a label calibration system which is characterized by comprising an authorization processing module, an analysis module, a basic label establishing module, a model label establishing module and a prediction label establishing module;
the authorization processing module is used for authorizing and processing financial historical data of a user, wherein the financial historical data comprises loan data, financial management data, card verification data and product query data;
the analysis module is used for screening financial keywords from the authorized financial historical data and carrying out statistical analysis on the screened financial keywords;
the basic label establishing module is used for establishing a plurality of dimensions and corresponding financial keywords in each dimension based on the financial keywords, and establishing a basic label system for the financial keywords in the plurality of dimensions in a permutation and combination mode;
the model label establishing module is used for establishing a model label to which a user belongs according to a preset judgment rule and a preset judgment model;
the prediction label establishing module is used for predicting the prediction label of the user by utilizing the scoring card model established by the deep learning technology and the model label of the user.
Preferably, the model labels include labels for multi-head (multi-platform) borrowing, co-debt, suspected borrowing new to repay old, suspected credit card, frequently used card, product preference, and user life cycle.
Preferably, the prediction labels include labels such as default probability, attrition probability, high-potential user identification, high-value public user identification, and fraudulent user identification.
Preferably, the modeling methods for the judgment rules and the judgment model include logistic regression, random forest and XGBoost.
The invention also provides a label calibration method, which is characterized by comprising the following steps:
authorizing and processing financial historical data of a user, wherein the financial historical data comprises user loan data, user financial management data, user card verification data and user product query data;
screening financial keywords from the financial historical data subjected to authorization processing, and performing statistical analysis on the screened financial keywords;
establishing a plurality of dimensions and financial keywords corresponding to each dimension based on the financial keywords, and establishing a basic label system for the financial keywords in the plurality of dimensions in a permutation and combination mode;
establishing a model label to which a user belongs through a preset judgment rule and a preset judgment model;
and predicting the prediction label of the user by using the scoring card model established by the deep learning technology and the model label of the user.
Preferably, the model labels include labels for multi-head (multi-platform) borrowing, co-debt, suspected borrowing new to repay old, suspected credit card, frequently used card, product preference, and user life cycle.
Preferably, the prediction labels include labels such as default probability, attrition probability, high-potential user identification, high-value public user identification, and fraudulent user identification.
Preferably, the modeling methods for the judgment rules and the judgment model include logistic regression, random forest and XGBoost.
On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.
The beneficial effects of the invention are as follows:
Wide coverage: the existing label systems of financial companies rest on a single data base, so their label coverage is relatively narrow; the present label system rests on a wider data base, so its label coverage is wider and more comprehensive.
Strong flexibility: the existing label systems of internet financial companies serve a single business scenario and suit only that company's business, whereas the present system suits a variety of business scenarios and offers high flexibility and applicability.
Strong technical performance: the model labels and prediction labels in existing company label systems are mainly based on traditional machine learning, whereas the present system establishes deeper labels by applying technologies such as deep learning and artificial intelligence.
Drawings
Fig. 1 is a block diagram of a label calibration system according to a preferred embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a tag according to a preferred embodiment of the invention.
Fig. 3 is a flowchart illustrating a label calibration method according to a preferred embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in Fig. 1 and Fig. 2, the present embodiment provides a label calibration system, which includes an authorization processing module 1, an analysis module 2, a basic label establishing module 3, a model label establishing module 4, and a prediction label establishing module 5.
The authorization processing module 1 is used for authorizing and processing financial history data of a user, wherein the financial history data comprises loan data, financial management data, card verification data and product query data.
The analysis module 2 is used for screening financial keywords from the authorized financial historical data and performing statistical analysis on the screened financial keywords.
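The statistical analysis performed on a screened keyword can be sketched as follows. This is a minimal illustration only, assuming that the values observed for one keyword form a time-ordered list of numbers. The function name `summarize`, the sample data, and the choice of a least-squares slope as the "trend" are assumptions made for the example and are not taken from the patent.

```python
from statistics import mean, pvariance


def summarize(values):
    """Return basic statistics for one keyword's time-ordered values."""
    if not values:
        raise ValueError("values must be non-empty")
    n = len(values)
    avg = mean(values)
    # Trend (assumed definition): least-squares slope of value vs. time index.
    if n > 1:
        t_avg = (n - 1) / 2
        num = sum((t - t_avg) * (v - avg) for t, v in enumerate(values))
        den = sum((t - t_avg) ** 2 for t in range(n))
        trend = num / den
    else:
        trend = 0.0
    return {
        "sum": sum(values),
        "max": max(values),
        "min": min(values),
        "mean": avg,
        "variance": pvariance(values),  # population variance
        "trend": trend,
    }


# Hypothetical monthly loan amounts for one user.
monthly_loan_amounts = [1000, 1500, 2000, 2500]
stats = summarize(monthly_loan_amounts)
print(stats)
```

Each entry of the returned dictionary could serve as the value of one basic label for the keyword in question.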
The basic tag establishing module 3 is used for establishing a plurality of dimensions and financial keywords corresponding to each dimension based on the financial keywords, and establishing a basic tag system for the financial keywords in the plurality of dimensions in a permutation and combination mode.
Dimensions: lending basic labels, financial management labels, card verification and authentication labels, data product query labels, user natural attributes, and other dimensions, totaling more than 600,000 labels.
Statistical methods: cumulative sum, maximum, minimum, average, variance, proportion, distance, trend, and the like.
Scenarios: the system covers multiple business scenarios such as ordinary cash loans, credit card repayment, automobile finance, consumer installment, and financial management.
For example: the user lending labels are formed by combining 7 dimensions into more than 430,000 labels; they cover every business scenario of the user and are suitable for risk control models in multiple business scenarios.
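The permutation-and-combination construction of the basic label system can be sketched as a Cartesian product over dimensions. The patent does not enumerate the values of each dimension, so the dimension names and values below (`scene`, `window`, `keyword`, `statistic`) and the function `build_label_names` are hypothetical and serve only to illustrate the idea.

```python
from itertools import product

# Hypothetical dimensions and values; the patent does not list them.
DIMENSIONS = {
    "scene": ["cash_loan", "auto_finance"],
    "window": ["7d", "30d", "90d"],
    "keyword": ["loan_amount", "loan_count"],
    "statistic": ["sum", "max", "mean"],
}


def build_label_names(dimensions):
    """Combine one value from every dimension into a basic label name."""
    keys = list(dimensions)
    combos = product(*(dimensions[k] for k in keys))
    return ["__".join(combo) for combo in combos]


labels = build_label_names(DIMENSIONS)
print(len(labels))  # 2 * 3 * 2 * 3 = 36
print(labels[0])    # cash_loan__7d__loan_amount__sum
```

Because the number of labels is the product of the dimension sizes, a handful of dimensions with modest numbers of values is enough to produce a very large label set.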
The model label establishing module 4 is used for establishing a model label to which the user belongs according to a preset judgment rule and a preset judgment model.
The model labels comprise labels such as multi-head (multi-platform) borrowing, co-debt, suspected borrowing new to repay old, suspected credit card, frequently used card, product preference, and user life cycle. The modeling methods for the judgment rules and the judgment models comprise logistic regression, random forest, XGBoost and the like.
For example:
Co-debt judgment rule: a user who has outstanding loans with multiple merchants at the same time is judged to be a co-debt user.
Suspected borrowing-new-to-repay-old judgment rule: a user is judged to be a suspected borrowing-new-to-repay-old user if, within the 3 days before a normally repaid loan falls due, the user takes out another loan on another platform.
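The two judgment rules above can be expressed as simple predicates over a user's loan records. The following is a sketch under stated assumptions: the `Loan` record and its field names are hypothetical, "normally repaid" is interpreted as repaid on or before the due date, and the 3-day window is treated as inclusive at both ends. None of these details are specified in the patent.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Loan:
    platform: str                     # merchant or platform issuing the loan
    start: date                       # date the loan was taken out
    due: date                         # contractual due date
    repaid_on: Optional[date] = None  # None while the loan is outstanding


def is_co_debt(loans: List[Loan], as_of: date) -> bool:
    """Co-debt: outstanding loans at more than one merchant at the same time."""
    open_platforms = {
        loan.platform
        for loan in loans
        if loan.start <= as_of
        and (loan.repaid_on is None or loan.repaid_on > as_of)
    }
    return len(open_platforms) > 1


def is_suspected_borrow_new_repay_old(loans: List[Loan], window_days: int = 3) -> bool:
    """Flag a new loan on another platform taken out within `window_days`
    before the due date of a loan that was repaid on time (assumed reading)."""
    for old in loans:
        if old.repaid_on is None or old.repaid_on > old.due:
            continue  # not a normally repaid loan
        window_start = old.due - timedelta(days=window_days)
        for new in loans:
            if new.platform != old.platform and window_start <= new.start <= old.due:
                return True
    return False


# Hypothetical records for one user.
loans = [
    Loan("platform_a", date(2019, 1, 1), date(2019, 1, 31), repaid_on=date(2019, 1, 31)),
    Loan("platform_b", date(2019, 1, 29), date(2019, 2, 28)),
]
print(is_co_debt(loans, as_of=date(2019, 1, 30)))  # True
print(is_suspected_borrow_new_repay_old(loans))    # True
```

In practice such rule outputs would be combined with the outputs of the judgment models to assign the final model label.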
The prediction label establishing module 5 is used for predicting the prediction label of the user by utilizing the scoring card model established by the deep learning technology and the model label of the user.
The prediction labels comprise labels such as default probability, attrition probability, high potential user identification, high value public user identification and fraud user identification.
For example: using deep learning technology, the user's default rate is scored according to user behavior by scoring cards such as a pre-loan application radar scoring card and a mid-loan behavior radar scoring card.
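The patent does not state how a model output is turned into a scoring card score. One common convention in credit scoring, shown here purely as an assumption, is log-odds scaling: the score equals an offset plus a factor times the natural logarithm of the good-to-bad odds, where the factor is chosen so that a fixed number of points doubles the odds. The function name and the default parameters (600 points at odds of 50:1, 20 points to double the odds) are illustrative and do not come from the patent.

```python
import math


def probability_to_score(p_default, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a default probability to a scoring card score (higher = lower risk).

    `base_score` is awarded at good:bad odds of `base_odds`, and every `pdo`
    points doubles the odds. All three defaults are illustrative assumptions.
    """
    if not 0.0 < p_default < 1.0:
        raise ValueError("p_default must be strictly between 0 and 1")
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds_good = (1.0 - p_default) / p_default
    return offset + factor * math.log(odds_good)


# The probabilities would come from the trained model; these are made up.
for p in (0.01, 0.05, 0.20):
    print(p, round(probability_to_score(p)))
```

With this scaling a lower predicted default probability always yields a higher score, so the score keeps the ranking produced by the underlying model.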
Basic labels: the basic label system established by the 7-dimension combination method has wide coverage, suits multiple business scenarios, and is highly flexible; suitable labels can be selected for modeling according to the business scenario.
Model labels: through the innovative combination of rules and models, model labels with stronger business relevance are established, such as multi-head labels, co-debt labels, suspected borrowing-new-to-repay-old labels, and frequently used card labels; these labels make it easier for internet financial companies to establish risk control strategies.
Prediction labels: unlike a traditional scoring card, the scoring card model established with deep learning technology is more accurate and discriminates between users better.
As shown in fig. 3, the embodiment further provides a label calibration method, which includes the following steps:
Step 101, authorizing and processing financial historical data of a user, wherein the financial historical data comprises loan data, financial management data, card verification data and product query data.
Step 102, screening financial keywords from the authorized financial historical data, and performing statistical analysis on the screened financial keywords.
Step 103, establishing a plurality of dimensions and financial keywords corresponding to each dimension based on the financial keywords, and establishing a basic label system for the financial keywords in the plurality of dimensions in a permutation and combination manner.
Step 104, establishing a model label to which the user belongs through a preset judgment rule and a preset judgment model; the model labels comprise labels such as multi-head (multi-platform) borrowing, co-debt, suspected borrowing new to repay old, suspected credit card, frequently used card, product preference, and user life cycle, and the modeling methods for the judgment rules and the judgment model comprise logistic regression, random forest and XGBoost.
Step 105, predicting the prediction label of the user by using the scoring card model established by the deep learning technology and the model label of the user. The prediction labels comprise labels such as default probability, attrition probability, high-potential user identification, high-value public user identification and fraudulent user identification.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (8)

1. A label calibration system is characterized by comprising an authorization processing module, an analysis module, a basic label establishing module, a model label establishing module and a prediction label establishing module;
the authorization processing module is used for authorizing and processing financial historical data of a user, wherein the financial historical data comprises user loan data, user financial management data, user card verification data and user product query data;
the analysis module is used for screening financial keywords from the authorized financial historical data and carrying out statistical analysis on the screened financial keywords;
the basic label establishing module is used for establishing a plurality of dimensions and corresponding financial keywords in each dimension based on the financial keywords, and establishing a basic label system for the financial keywords in the plurality of dimensions in a permutation and combination mode;
the model label establishing module is used for establishing a model label to which a user belongs according to a preset judgment rule and a preset judgment model;
the prediction label establishing module is used for predicting the prediction label of the user by utilizing the scoring card model established by the deep learning technology and the model label of the user.
2. The label calibration system of claim 1, wherein the model labels comprise labels of multi-head (multi-platform) borrowing, co-debt, suspected borrowing new to repay old, suspected credit card, frequently used card, product preference, and user life cycle.
3. The label calibration system of claim 1, wherein the prediction labels comprise labels such as default probability, attrition probability, high-potential user identification, high-value public user identification, and fraudulent user identification.
4. The label calibration system of claim 1, wherein the modeling methods for the judgment rules and the judgment model comprise logistic regression, random forest, and XGBoost.
5. A label calibration method is characterized by comprising the following steps:
authorizing and processing financial historical data of a user, wherein the financial historical data comprises user loan data, user financial management data, user card verification data and user product query data;
screening financial keywords from the financial historical data subjected to authorization processing, and performing statistical analysis on the screened financial keywords;
establishing a plurality of dimensions and financial keywords corresponding to each dimension based on the financial keywords, and establishing a basic label system for the financial keywords in the plurality of dimensions in a permutation and combination mode;
establishing a model label to which a user belongs through a preset judgment rule and a preset judgment model;
and predicting the prediction label of the user by using the scoring card model established by the deep learning technology and the model label of the user.
6. The label calibration method of claim 5, wherein the model labels comprise labels of multi-head (multi-platform) borrowing, co-debt, suspected borrowing new to repay old, suspected credit card, frequently used card, product preference, and user life cycle.
7. The label calibration method of claim 5, wherein the prediction labels comprise labels such as default probability, attrition probability, high-potential user identification, high-value public user identification, and fraudulent user identification.
8. The label calibration method of claim 5, wherein the modeling methods for the judgment rules and the judgment model comprise logistic regression, random forest and XGBoost.
CN201910751423.XA 2019-08-15 2019-08-15 Label calibration system and method Pending CN110675241A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910751423.XA CN110675241A (en) 2019-08-15 2019-08-15 Label calibration system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910751423.XA CN110675241A (en) 2019-08-15 2019-08-15 Label calibration system and method

Publications (1)

Publication Number Publication Date
CN110675241A (en) 2020-01-10

Family

ID=69075323

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910751423.XA Pending CN110675241A (en) 2019-08-15 2019-08-15 Label calibration system and method

Country Status (1)

Country Link
CN (1) CN110675241A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034658A (en) * 2018-08-22 2018-12-18 重庆邮电大学 A kind of promise breaking consumer's risk prediction technique based on big data finance
CN109255506A (en) * 2018-11-22 2019-01-22 重庆邮电大学 A kind of internet finance user's overdue loan prediction technique based on big data
CN109325639A (en) * 2018-12-06 2019-02-12 南京安讯科技有限责任公司 A kind of credit scoring card automation branch mailbox method for credit forecast assessment
CN109426861A (en) * 2017-08-16 2019-03-05 阿里巴巴集团控股有限公司 Data encryption, machine learning model training method, device and electronic equipment
CN109543178A (en) * 2018-11-01 2019-03-29 银江股份有限公司 A kind of judicial style label system construction method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
田江 等: "零售银行数据价值驱动模型研究与应用", 《电子科学技术》, 30 November 2016 (2016-11-30), pages 757 - 764 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200110
