CN110675241A - Label calibration system and method - Google Patents
Label calibration system and method
- Publication number
- CN110675241A (application CN201910751423.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- General Physics & Mathematics (AREA)
- Development Economics (AREA)
- Strategic Management (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Marketing (AREA)
- Mathematical Analysis (AREA)
- General Business, Economics & Management (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Game Theory and Decision Science (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Operations Research (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Algebra (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Technology Law (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The invention discloses a label calibration system and method. The system comprises: an authorization processing module for authorizing and processing a user's financial historical data; an analysis module for screening financial keywords from the authorized financial historical data and performing statistical analysis on the screened keywords; a basic label establishing module for establishing, based on the financial keywords, a plurality of dimensions and the financial keywords corresponding to each dimension, and for building a basic label system by permuting and combining the financial keywords across the dimensions; a model label establishing module for establishing the model label to which the user belongs by means of a preset judgment rule and a preset judgment model; and a prediction label establishing module for predicting the user's prediction label using a scorecard model built with deep learning technology together with the user's model label.
Description
Technical Field
The invention relates to the technical field of big data mining, in particular to a label calibration system and a label calibration method.
Background
With the rapid development and popularization of the internet, the volume of information has grown explosively and a large amount of it has accumulated online. At the same time, internet users are no longer only browsers of content but also creators of it, so information on the internet takes increasingly diverse forms, which makes screening it very difficult. The major internet companies have therefore each built a query label system based on the characteristics of their own data, for example Douban Reading and NetEase Cloud Music, so that users can filter and search more easily.
Similarly, in the field of internet finance risk control, each internet finance company builds labels for risk control based only on its own data. However, because internet finance data is sensitive and confidential and internet users are highly mobile, the resulting labels are incomplete and clearly limited, and the label systems built on them struggle to achieve the expected risk-control effect.
The existing label systems of internet finance companies have the following defects:
Unsound foundation: because the basic data behind the label system is limited, the system cannot describe a user comprehensively, which leads to incorrect risk-control strategies.
Business limitations: because the basic data has a single business attribute, the label system is limited and cannot be applied to other business scenarios.
Technical limitations: because the basic data is not rich enough, models built on the label system with deep learning and artificial intelligence technology are difficult to train to the desired effect.
Disclosure of Invention
In view of the problems and defects of the prior art, the invention provides a label calibration system and a label calibration method.
The invention solves the technical problems through the following technical solutions:
The invention provides a label calibration system, characterized in that it comprises an authorization processing module, an analysis module, a basic label establishing module, a model label establishing module and a prediction label establishing module;
the authorization processing module is used for authorizing and processing a user's financial historical data, wherein the financial historical data comprises loan data, financial management data, card verification data and product query data;
the analysis module is used for screening financial keywords from the authorized financial historical data and performing statistical analysis on the screened financial keywords;
the basic label establishing module is used for establishing, based on the financial keywords, a plurality of dimensions and the financial keywords corresponding to each dimension, and for building a basic label system by permuting and combining the financial keywords across the plurality of dimensions;
the model label establishing module is used for establishing the model label to which the user belongs by means of a preset judgment rule and a preset judgment model;
the prediction label establishing module is used for predicting the user's prediction label using the scorecard model built with deep learning technology together with the user's model label.
Preferably, the model labels include labels of multi-head lending, co-debt, suspected borrowing new to repay old, suspected credit card, commonly used card, product preference and user life cycle.
Preferably, the prediction labels include labels such as default probability, attrition probability, high-potential user identification, high-value public user identification and fraudulent user identification.
Preferably, the judgment rules and the judgment model are modeled using logistic regression, random forest and XGBoost.
The invention also provides a label calibration method, characterized in that it comprises the following steps:
authorizing and processing a user's financial historical data, wherein the financial historical data comprises user loan data, user financial management data, user card verification data and user product query data;
screening financial keywords from the authorized financial historical data, and performing statistical analysis on the screened financial keywords;
establishing, based on the financial keywords, a plurality of dimensions and the financial keywords corresponding to each dimension, and building a basic label system by permuting and combining the financial keywords across the plurality of dimensions;
establishing the model label to which the user belongs by means of a preset judgment rule and a preset judgment model;
and predicting the user's prediction label using the scorecard model built with deep learning technology together with the user's model label.
Preferably, the model labels include labels of multi-head lending, co-debt, suspected borrowing new to repay old, suspected credit card, commonly used card, product preference and user life cycle.
Preferably, the prediction labels include labels such as default probability, attrition probability, high-potential user identification, high-value public user identification and fraudulent user identification.
Preferably, the judgment rules and the judgment model are modeled using logistic regression, random forest and XGBoost.
On the basis of common knowledge in the field, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the invention.
The positive effects of the invention are as follows:
Wide coverage: the label systems of existing finance companies rest on a single data base, so their label coverage is relatively narrow; the present label system rests on a broader data base, so its coverage is wider and more comprehensive.
Strong flexibility: the label system of an existing internet finance company serves a single business scenario and suits only that company's business, whereas the present system suits a variety of business scenarios and is highly flexible and applicable.
Strong technical capability: the model labels and prediction labels in existing company label systems rely mainly on traditional machine learning, whereas the present system builds deeper labels by applying deep learning, artificial intelligence and related technologies.
Drawings
Fig. 1 is a block diagram of a label calibration system according to a preferred embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a tag according to a preferred embodiment of the invention.
Fig. 3 is a flowchart illustrating a label calibration method according to a preferred embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in Fig. 1 and Fig. 2, the present embodiment provides a label calibration system, which includes an authorization processing module 1, an analysis module 2, a basic label establishing module 3, a model label establishing module 4, and a prediction label establishing module 5.
The authorization processing module 1 is used for authorizing and processing financial history data of a user, wherein the financial history data comprises loan data, financial management data, card verification data and product query data.
The analysis module 2 is used for screening financial keywords from the authorized financial historical data and performing statistical analysis on the screened financial keywords.
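As a rough illustration of what the analysis module does, the following sketch screens records by keyword and computes a few of the statistics the embodiment names (sum, maximum, minimum, average, variance). The record fields `keyword` and `amount`, the keyword set and the function name `analyze` are assumptions made for this example; the patent does not specify them.

```python
from statistics import mean, pvariance

# Hypothetical keyword set; the patent does not enumerate its financial keywords.
FINANCIAL_KEYWORDS = {"loan_application", "repayment", "card_verification"}

def analyze(records):
    """Screen records by financial keyword, then compute basic statistics per keyword."""
    grouped = {}
    for rec in records:
        if rec["keyword"] in FINANCIAL_KEYWORDS:  # screening step
            grouped.setdefault(rec["keyword"], []).append(rec["amount"])
    return {
        kw: {
            "sum": sum(vals),
            "max": max(vals),
            "min": min(vals),
            "mean": mean(vals),
            "variance": pvariance(vals),  # population variance
        }
        for kw, vals in grouped.items()
    }

records = [
    {"keyword": "repayment", "amount": 100.0},
    {"keyword": "repayment", "amount": 300.0},
    {"keyword": "page_view", "amount": 0.0},  # not a financial keyword, dropped
    {"keyword": "loan_application", "amount": 5000.0},
]
stats = analyze(records)
```

Here `stats["repayment"]` holds the five statistics for the two repayment records, and the `page_view` record is filtered out during screening.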
The basic tag establishing module 3 is used for establishing a plurality of dimensions and financial keywords corresponding to each dimension based on the financial keywords, and establishing a basic tag system for the financial keywords in the plurality of dimensions in a permutation and combination mode.
Dimensions: lending basic labels, financial management labels, card verification and authentication labels, data product query labels, the user's natural attributes and other dimensions, totaling more than 600,000 labels.
Statistical methods: cumulative sum, maximum, minimum, average, variance, proportion, distance, trend, and the like.
Scenarios: the system covers a number of business scenarios, such as ordinary financial cash loans, credit card repayment, auto finance, consumer installment and financial management.
For example: the user lending labels are formed by combining 7 dimensions into more than 430,000 labels, which cover each of the user's business scenarios and suit risk-control models for multiple business scenarios.
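The permutation-and-combination step can be pictured as a Cartesian product over the values of each dimension. The sketch below is only an illustration under assumed inputs: the dimension names and values are hypothetical, since the patent names the kinds of dimensions and statistics but not their concrete values.

```python
from itertools import product

# Hypothetical dimension values, chosen only to show the combination step.
dimensions = {
    "scene": ["cash_loan", "auto_finance", "installment"],
    "event": ["application", "approval", "overdue"],
    "window": ["7d", "30d", "90d"],
    "statistic": ["sum", "max", "min", "mean", "variance"],
}

def build_basic_labels(dims):
    """Return one label name for every combination of dimension values."""
    names = list(dims)
    return ["_".join(combo) for combo in product(*(dims[n] for n in names))]

labels = build_basic_labels(dimensions)
print(len(labels))  # 3 * 3 * 3 * 5 = 135
print(labels[0])    # cash_loan_application_7d_sum
```

Because the label count is the product of the dimension sizes, a handful of dimensions with modest value lists is enough to reach hundreds of thousands of labels.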
The model label establishing module 4 is used for establishing a model label to which the user belongs according to a preset judgment rule and a preset judgment model.
The model labels include labels such as multi-head lending, co-debt, suspected borrowing new to repay old, suspected credit card, commonly used card, product preference and user life cycle. The modeling methods for the judgment rules and judgment models include logistic regression, random forest, XGBoost and the like.
For example:
Co-debt judgment rule: a user who has outstanding loans with a plurality of merchants at the same time is judged to be a co-debt user.
Suspected borrowing-new-to-repay-old judgment rule: a user who takes out another loan on another platform within 3 days before a normally repaid loan falls due is judged to be a suspected borrowing-new-to-repay-old user.
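The two example rules can be sketched as simple predicates over a user's loan list. This is a minimal sketch under stated assumptions: the `Loan` fields, the function names and the `min_merchants=2` threshold are hypothetical, and the check that the old loan was "normally repaid" is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Loan:
    merchant: str      # lending platform (hypothetical field names)
    start: date        # date the loan was taken out
    due: date          # scheduled repayment date
    outstanding: bool  # True if not yet repaid at evaluation time

def is_co_debt(loans, min_merchants=2):
    """Co-debt: outstanding loans with several merchants at the same time."""
    merchants = {loan.merchant for loan in loans if loan.outstanding}
    return len(merchants) >= min_merchants

def is_suspected_borrow_new_repay_old(loans, window_days=3):
    """Flag a later loan on another platform taken within `window_days`
    before an existing loan falls due."""
    window = timedelta(days=window_days)
    for old in loans:
        for new in loans:
            if new.merchant == old.merchant or new.start <= old.start:
                continue  # same platform, or not a later loan
            if timedelta(0) <= old.due - new.start <= window:
                return True
    return False

loans = [
    Loan("A", date(2019, 1, 1), date(2019, 1, 31), outstanding=False),
    Loan("B", date(2019, 1, 29), date(2019, 2, 28), outstanding=True),
]
flag_new_for_old = is_suspected_borrow_new_repay_old(loans)
flag_co_debt = is_co_debt(loans)
```

In this example the loan from platform B starts two days before the loan from platform A is due, so the borrowing-new-to-repay-old flag is raised, while the co-debt flag is not, because only one loan is still outstanding.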
The prediction label establishing module 5 is used for predicting the user's prediction label using the scorecard model built with deep learning technology together with the user's model label.
The prediction labels include labels such as default probability, attrition probability, high-potential user identification, high-value public user identification and fraudulent user identification.
For example: using deep learning technology, the user's default probability is split according to user behavior into a pre-loan application radar scorecard, an in-loan behavior radar scorecard, and so on.
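The patent does not describe how its deep-learning scorecard converts a predicted default probability into a score. As one possible illustration, the sketch below uses the conventional scorecard scaling, score = offset + factor × ln(odds), where the odds are good-to-bad and the factor is derived from the "points to double the odds" (PDO). The function name and the parameter values (600 points at 50:1 odds, PDO of 20) are assumptions for this example, not values from the source.

```python
import math

def probability_to_score(p_default, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a default probability to a scorecard score.

    Conventional scaling, assumed here: a score of `base_score` corresponds to
    good:bad odds of `base_odds`, and every `pdo` points doubles the odds.
    """
    if not 0.0 < p_default < 1.0:
        raise ValueError("p_default must lie strictly between 0 and 1")
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds = (1.0 - p_default) / p_default  # good:bad odds
    return offset + factor * math.log(odds)

score_at_base = probability_to_score(1 / 51)      # odds 50:1, about 600
score_at_double = probability_to_score(1 / 101)   # odds 100:1, about 620
```

Under this scaling a lower default probability always yields a higher score, so any model that outputs a probability, including a deep-learning one, can feed the same score scale.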
Basic labels: the basic label system built by the 7-dimension combination method has wide coverage, suits multiple business scenarios and is highly flexible; suitable labels can be selected for modeling according to the business scenario.
Model labels: through the "rule + model" innovation, model labels with stronger business relevance are established, such as multi-head lending, co-debt, suspected borrowing-new-to-repay-old and commonly used card labels, which make it easier for internet finance companies to set risk-control strategies.
Prediction labels: unlike a traditional scorecard, the scorecard model built with deep learning technology is more accurate and discriminates better between users.
As shown in Fig. 3, the embodiment further provides a label calibration method, which includes the following steps:
Step 102: screening financial keywords from the authorized financial historical data, and performing statistical analysis on the screened financial keywords.
Step 105: predicting the user's prediction label using the scorecard model built with deep learning technology together with the user's model label. The prediction labels include labels such as default probability, attrition probability, high-potential user identification, high-value public user identification and fraudulent user identification.
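The five stages of the method (authorize, screen and analyze, build basic labels, build model labels, predict) can be sketched as a simple pipeline to show how data flows from one stage to the next. Everything concrete here is a placeholder assumption: the `authorized` flag, the count-only statistic, the `frequent_applicant` label, the threshold and the `scorer` callable stand in for logic the patent leaves unspecified.

```python
def authorize(records):
    """Keep only records the user has authorized (assumed boolean flag)."""
    return [r for r in records if r.get("authorized")]

def screen_and_analyze(records, keywords):
    """Screen by keyword and count occurrences (one statistic, for brevity)."""
    counts = {}
    for r in records:
        if r["keyword"] in keywords:
            counts[r["keyword"]] = counts.get(r["keyword"], 0) + 1
    return counts

def basic_labels(counts):
    """One basic label per keyword/statistic pair."""
    return {f"{kw}_count": n for kw, n in counts.items()}

def model_labels(basic, threshold):
    """Placeholder rule standing in for the preset judgment rules and model."""
    return {"frequent_applicant": basic.get("loan_application_count", 0) >= threshold}

def prediction_labels(model, scorer):
    """`scorer` stands in for the scorecard model; it is supplied by the caller."""
    return {"default_probability": scorer(model)}

def calibrate(records, keywords, scorer, threshold=2):
    """Run the five stages in order and return all three label layers."""
    counts = screen_and_analyze(authorize(records), keywords)
    basic = basic_labels(counts)
    model = model_labels(basic, threshold)
    return basic, model, prediction_labels(model, scorer)

records = [
    {"authorized": True, "keyword": "loan_application"},
    {"authorized": True, "keyword": "loan_application"},
    {"authorized": False, "keyword": "loan_application"},  # dropped: not authorized
    {"authorized": True, "keyword": "page_view"},          # dropped: not a keyword
]
# Toy scorer returning placeholder values, not a trained model.
toy_scorer = lambda m: 1.0 if m["frequent_applicant"] else 0.0
basic, model, prediction = calibrate(records, {"loan_application"}, toy_scorer)
```

Passing the scorer in as a parameter keeps the pipeline independent of how the prediction model is built, so a trained model could replace the toy function without changing the earlier stages.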
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that these are by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.
Claims (8)
1. A label calibration system is characterized by comprising an authorization processing module, an analysis module, a basic label establishing module, a model label establishing module and a prediction label establishing module;
the authorization processing module is used for authorizing and processing financial historical data of a user, wherein the financial historical data comprises user loan data, user financial management data, user card verification data and user product query data;
the analysis module is used for screening financial keywords from the authorized financial historical data and carrying out statistical analysis on the screened financial keywords;
the basic label establishing module is used for establishing a plurality of dimensions and corresponding financial keywords in each dimension based on the financial keywords, and establishing a basic label system for the financial keywords in the plurality of dimensions in a permutation and combination mode;
the model label establishing module is used for establishing a model label to which a user belongs according to a preset judgment rule and a preset judgment model;
the prediction label establishing module is used for predicting the prediction label of the user by utilizing the scoring card model established by the deep learning technology and the model label of the user.
2. The label calibration system of claim 1, wherein the model labels include labels of multi-head lending, co-debt, suspected borrowing new to repay old, suspected credit card, commonly used card, product preference and user life cycle.
3. The label calibration system of claim 1, wherein the prediction labels include labels of default probability, attrition probability, high-potential user identification, high-value public user identification, fraudulent user identification, and the like.
4. The label calibration system of claim 1, wherein the judgment rules and the judgment model are modeled using logistic regression, random forest and XGBoost.
5. A label calibration method is characterized by comprising the following steps:
authorizing and processing financial historical data of a user, wherein the financial historical data comprises user loan data, user financial management data, user card verification data and user product query data;
screening financial keywords from the financial historical data subjected to authorization processing, and performing statistical analysis on the screened financial keywords;
establishing a plurality of dimensions and financial keywords corresponding to each dimension based on the financial keywords, and establishing a basic label system for the financial keywords in the plurality of dimensions in a permutation and combination mode;
establishing a model label to which a user belongs through a preset judgment rule and a preset judgment model;
and predicting the prediction label of the user by using the scoring card model established by the deep learning technology and the model label of the user.
6. The label calibration method of claim 5, wherein the model labels include labels of multi-head lending, co-debt, suspected borrowing new to repay old, suspected credit card, commonly used card, product preference and user life cycle.
7. The label calibration method of claim 5, wherein the prediction labels include labels of default probability, attrition probability, high-potential user identification, high-value public user identification, fraudulent user identification, and the like.
8. The label calibration method of claim 5, wherein the judgment rules and the judgment model are modeled using logistic regression, random forest and XGBoost.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910751423.XA CN110675241A (en) | 2019-08-15 | 2019-08-15 | Label calibration system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110675241A (en) | 2020-01-10 |
Family
ID=69075323
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910751423.XA Pending CN110675241A (en) | 2019-08-15 | 2019-08-15 | Label calibration system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110675241A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109426861A (en) * | 2017-08-16 | 2019-03-05 | 阿里巴巴集团控股有限公司 | Data encryption, machine learning model training method, device and electronic equipment |
CN109034658A (en) * | 2018-08-22 | 2018-12-18 | 重庆邮电大学 | A kind of promise breaking consumer's risk prediction technique based on big data finance |
CN109543178A (en) * | 2018-11-01 | 2019-03-29 | 银江股份有限公司 | A kind of judicial style label system construction method and system |
CN109255506A (en) * | 2018-11-22 | 2019-01-22 | 重庆邮电大学 | A kind of internet finance user's overdue loan prediction technique based on big data |
CN109325639A (en) * | 2018-12-06 | 2019-02-12 | 南京安讯科技有限责任公司 | A kind of credit scoring card automation branch mailbox method for credit forecast assessment |
Non-Patent Citations (2)
Title |
---|
Tian Jiang et al.: "Research and Application of a Data-Value-Driven Model for Retail Banking" (零售银行数据价值驱动模型研究与应用), Electronic Science and Technology (《电子科学技术》), 30 November 2016 (2016-11-30), pages 757 - 764 * |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200110 |