CN114299350A - Artificial credit auditing information recommendation method and system based on machine learning - Google Patents
Manual credit auditing information recommendation method and system based on machine learning
- Publication number
- CN114299350A (application CN202111536543.1A)
- Authority
- CN
- China
- Prior art keywords
- information
- model
- enterprise
- characteristic
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The invention discloses a machine-learning-based method and system for recommending information during manual credit review, and belongs to the technical field of credit review. The method comprises the following steps: 1. content material preparation: standardizing the data of the object under review; 2. feature collection: collecting the approvers' behavior data, environment data, and the feature set of the reviewed object; 3. feature processing: processing the features into standardized feature vectors; 4. model training: inputting the standardized feature vectors into a combined model formed by a GBDT model and a logistic regression (LR) model for training, to obtain a mapping between feature vectors and content materials; 5. model prediction: applying the feature processing to the user information to be predicted, inputting the result into the trained model, and outputting the top n content materials ranked by predicted click-through rate. By learning how approvers use the materials, the method and system form intelligent recommendations, can screen the content that approvers prefer out of a large volume of material, and improve approval efficiency.
Description
Technical Field
The invention belongs to the technical field of credit review, and in particular relates to a machine-learning-based method and system for recommending information during manual credit review.
Background
At present, financial technology has moved the client loan application process online. Some standardized steps, such as admission screening and distribution, are already decided automatically on the basis of differences in client quality. Parts with high uncertainty, however, still require manual approvers, who rely on industry experience and market preference to judge subjectively whether the client's intended use of funds is reasonable, whether the repayment source is reliable, and so on. When evaluating an enterprise, a credit reviewer needs to consult multidimensional information, such as the applicant's personal credit history, social behavior, and personal assets and liabilities; the financial statements of enterprises in the applicant's name; the enterprise's association network; public opinion; and the market conditions and market cycle of the enterprise's industry. A large amount of information must be searched and consulted, sometimes compared repeatedly, and a doubt about one piece of information triggers the search for the next. Current practice in the market is to load the collected information into a database through the approval system, provide a query interface, and support the development of customized reports that display the information the credit reviewer needs, but none of this offers intelligent recommendation.
Data returned by the query interface lacks interactivity: usually only single-point information is returned, multiple pieces of information cannot be combined or compared, and it is difficult to support decisions. Staff often have to combine the single-point information by hand, which is time-consuming and inefficient. Customized reports can only display fixed information, and the development cycle for adding new information is long, which does not match rapidly changing business requirements.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a machine-learning-based manual credit review information recommendation method and system, which aims to improve the efficiency of information retrieval and the experience of approvers during the approval process, and thereby improve approval efficiency.
The technical scheme adopted by the invention is as follows:
A machine-learning-based manual credit review information recommendation method comprises the following steps:
Step 1: content material preparation: standardizing the data of the object under review, generating table and graphic materials, and forming corresponding content links;
Step 2: feature collection: collecting the approvers' behavior data, environment data, and the feature set of the reviewed object;
Step 3: feature processing: processing the features collected in step 2 into standardized feature vectors and attaching content preference labels;
Step 4: model training: inputting the standardized feature vectors obtained in step 3 into a combined model formed by a GBDT model and a logistic regression model for training, to form a mapping between feature vectors and content materials;
Step 5: model prediction: applying the feature processing of step 3 to the user information to be predicted to obtain a standardized feature vector, inputting that vector into the model trained in step 4, and outputting the top n content materials ranked by predicted click-through rate, where n is an integer greater than 0.
With this technical scheme, when a credit review is carried out for a user, the client information and associated enterprise information are standardized and made into table or graphic materials, which the system recommends to the approver. Intelligent recommendation is formed by learning how approvers use the materials, for example dwell time, click-through rate, the order in which materials are viewed, and subscription information, so that the content approvers view most often is screened out of the large volume of material and approval efficiency is improved. In addition, the training model is a GBDT + LR combined model (gradient boosting decision tree + logistic regression), which gains the benefit of feature combination while controlling model complexity and reducing the demand on computing performance.
Preferably, in step 1, the content material includes information related to the applicant, information related to the applicant's main enterprise, information related to the main enterprise's associated enterprises, information on the industry to which the enterprise belongs, and online public opinion about the applicant or the main enterprise.
Preferably, the applicant-related information includes the applicant's age, gender, position, educational background, and personal liability size; the information related to the applicant's main enterprise includes the enterprise's region, revenue scale, staff scale, tax income, and financial statements; the information related to the main enterprise's associated enterprises includes each associated enterprise's region, revenue scale, staff scale, tax income, and financial statements.
Preferably, the approver behavior data in step 2 includes the approver's click-through rate, dwell time, and click sequence on the content materials; the environment data includes login time, the current approval node, behavior data of other approvers, approval time, and the attributes of the applicant and enterprise under review.
Further preferably, step 5 specifically comprises: inputting the standardized feature vector into the GBDT model, which screens and combines features to produce a discrete feature vector; inputting the discrete feature vector into the logistic regression model, which predicts the user's click-through rate on each content material according to the mapping between discrete feature vectors and content materials; and screening out and outputting the top n content materials by click-through rate.
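The handoff from GBDT to logistic regression described above can be written compactly. The notation below is our own and is not taken from the patent: it assumes a trained GBDT with T trees, where tree t has L_t leaves and ℓ_t(x) is the index of the leaf reached by input x, and it assumes that x_{u,m} is the standardized feature vector built for approver context u and candidate material m.

```latex
\[
\phi(x) = \bigl[\, e^{(L_1)}_{\ell_1(x)} \;;\; \dots \;;\; e^{(L_T)}_{\ell_T(x)} \,\bigr]
\in \{0,1\}^{L_1 + \dots + L_T},
\qquad
\hat{p}_{u,m} = \sigma\!\bigl(w^{\top}\phi(x_{u,m}) + b\bigr),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
\]
```

Here e^{(L)}_{k} is the length-L one-hot vector with a 1 in position k, so φ(x) is the discrete feature vector, and the recommendation is the n materials m with the largest predicted click-through rate for the given u.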
The invention also discloses a machine-learning-based manual credit review information recommendation system, comprising:
an online feature acquisition module: used for acquiring the user information to be predicted in real time;
an offline feature collection module: used for collecting the approvers' behavior data, environment data, and the feature set of the reviewed object;
a feature processing module: used for processing the features obtained by the online feature acquisition module and the offline feature collection module into standardized feature vectors and labeling them;
a recommendation module: used for outputting the top n content materials by click-through rate according to the input standardized feature vector.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
1. The invention standardizes the client information and associated enterprise information, makes them into table or graphic materials, and recommends the materials to the approver. Intelligent recommendation is formed by learning how approvers use the materials, for example dwell time, click-through rate, the order in which materials are viewed, and subscription information, so that the content approvers view most often is screened out of the large volume of material and approval efficiency is improved.
2. The training model of the invention is a GBDT + LR combined model (gradient boosting decision tree + logistic regression), which gains the benefit of feature combination while controlling model complexity and reducing the demand on computing performance.
Drawings
The invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a block diagram of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
In the description of the embodiments of the present application, it should be noted that the terms "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings or orientations or positional relationships that the products of the present invention are usually placed in when used, and are only used for convenience of description and simplicity of description, but do not indicate or imply that the devices or elements that are referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present application. Furthermore, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
The present invention will be described in detail with reference to fig. 1 to 2.
A machine-learning-based manual credit review information recommendation method comprises the following steps:
Step 1: content material preparation: standardizing the data of the object under review, generating table and graphic materials, and forming corresponding content links;
The materials a credit approver needs to view include: 1) the applicant's personal information (age, gender, position, educational background, personal asset and liability size, etc.); 2) information on the applicant's main enterprise (region, revenue scale, staff scale, tax income, financial statements, etc.); 3) information on the main enterprise's associated enterprises (the same items as for the main enterprise); 4) information on the industry to which the enterprise belongs (e.g., manufacturing, science and technology, or trade); 5) online public opinion about the applicant or the main enterprise (websites, official accounts, search engines, etc.). These are standardized into tables or graphics, and content links are formed and displayed as titles and covers. For example, for a material titled "Core financial indicators for the third quarter of the most recent fiscal year", the approver can click the cover to view the corresponding indicators.
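As an illustration of step 1, the sketch below wraps standardized data as a content material with a title and a link. It is only a sketch under stated assumptions: the `ContentMaterial` fields, the `make_material` helper, the `/materials/...` link scheme, and the sample figures are all hypothetical, and "standardization" is reduced here to normalizing keys and converting values to floats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentMaterial:
    """One recommendable item: a table or graphic plus the link shown to the approver."""
    material_id: str
    title: str    # text displayed on the cover
    kind: str     # "table" or "graphic"
    link: str     # hypothetical internal link scheme
    rows: tuple   # standardized (name, value) pairs


def make_material(material_id: str, title: str, kind: str, raw: dict) -> ContentMaterial:
    """Standardize raw key/value data and wrap it as a content material."""
    if kind not in ("table", "graphic"):
        raise ValueError(f"unsupported material kind: {kind}")
    # Assumed standardization: trimmed lower-case keys, float values, sorted by key.
    rows = tuple(sorted((str(k).strip().lower(), float(v)) for k, v in raw.items()))
    return ContentMaterial(material_id, title.strip(), kind, f"/materials/{material_id}", rows)


material = make_material(
    "fin-q3",
    "Core financial indicators, Q3 of the most recent fiscal year",
    "table",
    {" Revenue ": "1200.5", "Net Profit": 85},  # made-up sample values
)
print(material.link, material.rows)
```

Freezing the dataclass keeps a material immutable once published, so a link always refers to the same standardized content.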
Step 2: feature collection: collecting the approvers' behavior data, environment data, and the feature set of the reviewed object;
The approver behavior data includes the approver's browsing records for the content materials, such as click-through rate, dwell time, and click sequence. The environment data includes login time, the current approval node, behavior data of other approvers, approval time, and the attributes of the applicant and enterprise under review. The feature set of the reviewed object includes personal information of the reviewed person (such as gender, age, and position), basic information on the company (such as industry, name, and tax payment scale), and the completeness of the data.
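The behavior features listed above could be derived from a raw view log roughly as follows. This is a minimal sketch, not the patent's implementation: the `ViewEvent` record, its field names, and the choice of click count, total dwell time, and first-click position as the aggregated features are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewEvent:
    approver_id: str
    material_id: str
    opened_at: float  # seconds since the start of the session
    closed_at: float


def behavior_features(events):
    """Aggregate view events into per-(approver, material) behavior features."""
    stats = defaultdict(lambda: {"clicks": 0, "dwell_seconds": 0.0, "first_click_rank": None})
    position = defaultdict(int)  # running click position per approver
    for ev in sorted(events, key=lambda e: e.opened_at):
        position[ev.approver_id] += 1
        s = stats[(ev.approver_id, ev.material_id)]
        s["clicks"] += 1
        s["dwell_seconds"] += ev.closed_at - ev.opened_at
        if s["first_click_rank"] is None:
            s["first_click_rank"] = position[ev.approver_id]
    return dict(stats)


events = [
    ViewEvent("a1", "fin-q3", 0.0, 30.0),
    ViewEvent("a1", "network", 35.0, 50.0),
    ViewEvent("a1", "fin-q3", 60.0, 70.0),
]
features = behavior_features(events)
print(features[("a1", "fin-q3")])
```

Environment data such as login time or the current approval node would be joined onto these records by approver and session; that join is omitted here.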
Step 3: feature processing: processing the features collected in step 2 into standardized feature vectors and attaching content preference labels (if material A is clicked, the approver is considered to regard material A as more important to the credit review; the number of clicks indicates how important a material is to the review), and using the standardized feature vectors as the model input;
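A minimal sketch of this step, assuming z-score scaling as the standardization and a binary "clicked at least once" rule as the content preference label. The patent specifies neither choice, and the column names and sample rows are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows, columns):
    """Z-score each named column; a constant column maps to 0.0."""
    scaled = [dict(r) for r in rows]  # copy so the input rows stay untouched
    for col in columns:
        values = [r[col] for r in rows]
        mu, sigma = mean(values), pstdev(values)
        for r in scaled:
            r[col] = 0.0 if sigma == 0 else (r[col] - mu) / sigma
    return scaled


def label_preference(rows):
    """Label 1 if the approver clicked the material at least once, else 0."""
    return [1 if r["clicks"] > 0 else 0 for r in rows]


rows = [
    {"login_hour": 9, "revenue_millions": 120.0, "is_enterprise": 1, "clicks": 2},
    {"login_hour": 14, "revenue_millions": 15.0, "is_enterprise": 1, "clicks": 0},
    {"login_hour": 10, "revenue_millions": 60.0, "is_enterprise": 1, "clicks": 1},
]
scaled = standardize(rows, ["login_hour", "revenue_millions", "is_enterprise"])
labels = label_preference(rows)
print(labels)
```

The click count is used only to build the label and is left out of the scaled columns, so the model input does not contain the quantity it is asked to predict.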
Step 4: model training: inputting the standardized feature vectors obtained in step 3 into a combined model formed by a GBDT model and a logistic regression model for training, to form a mapping between the feature vectors and the content materials of step 1;
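To make the combined model concrete, here is a deliberately small pure-Python sketch of the GBDT + LR idea: boosted trees are fitted first, each sample is then encoded by the leaf it reaches in every tree, and logistic regression is trained on that one-hot encoding. It simplifies heavily and is not the patent's implementation: trees are depth-1 stumps, boosting uses plain log-loss residuals, the data are made up, all function names are hypothetical, and `fit_stump` assumes at least one feature takes two distinct values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Pick the (feature, threshold) split with the lowest squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X})[:-1]:  # skip the max so the right leaf is non-empty
            left = [r for x, r in zip(X, residuals) if x[j] <= t]
            right = [r for x, r in zip(X, residuals) if x[j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return best[1:]  # (feature, threshold, left_value, right_value)


def fit_gbdt(X, y, n_trees=3, step=0.5):
    """Gradient boosting on log-loss using depth-1 trees."""
    scores, trees = [0.0] * len(X), []
    for _ in range(n_trees):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]  # negative gradient
        j, t, lv, rv = fit_stump(X, residuals)
        trees.append((j, t, lv, rv))
        scores = [s + step * (lv if x[j] <= t else rv) for s, x in zip(scores, X)]
    return trees


def leaf_one_hot(trees, x):
    """Discrete feature vector: one one-hot pair per tree marking the leaf x falls into."""
    out = []
    for j, t, _, _ in trees:
        out += [1.0, 0.0] if x[j] <= t else [0.0, 1.0]
    return out


def fit_lr(Z, y, epochs=500, step=0.5):
    """Logistic regression trained by batch gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for z, yi in zip(Z, y):
            err = sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) - yi
            grad_b += err
            for k, zk in enumerate(z):
                grad_w[k] += err * zk
        w = [wi - step * g / len(Z) for wi, g in zip(w, grad_w)]
        b -= step * grad_b / len(Z)
    return w, b


def predict_ctr(trees, w, b, x):
    z = leaf_one_hot(trees, x)
    return sigmoid(b + sum(wi * zi for wi, zi in zip(w, z)))


# Made-up standardized features and click labels.
X = [[-1.0, 0.2], [-0.5, -0.3], [0.5, 0.1], [1.0, -0.2]]
y = [0, 0, 1, 1]
trees = fit_gbdt(X, y)
w, b = fit_lr([leaf_one_hot(trees, x) for x in X], y)
print(predict_ctr(trees, w, b, [0.8, 0.0]), predict_ctr(trees, w, b, [-0.8, 0.0]))
```

In practice a library GBDT with deeper trees would replace the stumps; the part that carries over is the handoff, where the logistic regression sees leaf memberships rather than the raw standardized features.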
Step 5: model prediction: applying the feature processing of step 3 to the user information to be predicted to obtain a standardized feature vector; inputting that vector first into the GBDT model, which screens and combines features to produce a discrete feature vector; then inputting the discrete feature vector into the logistic regression model, which predicts the approver's click-through rate on each content material according to the mapping between discrete feature vectors and the content materials of step 1; and screening out and outputting the top n content materials (generally the top 10) as the recommended content displayed to the user.
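The final screening in step 5 is a top-n selection over predicted click-through rates. A minimal sketch, assuming the predictions are already available as a mapping from material id to score; the ids, the scores, and the tie-breaking rule are hypothetical.

```python
import heapq


def recommend_top_n(predicted_ctr, n=10):
    """Return the ids of the n materials with the highest predicted click-through rate."""
    if n <= 0:
        raise ValueError("n must be an integer greater than 0")
    # Higher score first; ties are broken by material id so the output is deterministic.
    ranked = heapq.nsmallest(n, predicted_ctr.items(), key=lambda kv: (-kv[1], kv[0]))
    return [material_id for material_id, _ in ranked]


scores = {"fin-q3": 0.42, "public-opinion": 0.17, "network": 0.42, "industry": 0.08}
print(recommend_top_n(scores, n=2))
```

When fewer than n materials exist, the function returns all of them in ranked order.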
The invention also provides a machine-learning-based manual credit review information recommendation system, comprising:
an online feature acquisition module: used for acquiring the user information to be predicted in real time;
an offline feature collection module: used for collecting the approvers' behavior data, environment data, and the feature set of the reviewed object;
a feature processing module: used for processing the features obtained by the online feature acquisition module and the offline feature collection module into standardized feature vectors and labeling them;
a recommendation module: used for outputting the top n content materials by click-through rate according to the input standardized feature vector.
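One way to picture how the modules fit together at prediction time is shown below. This is an architectural sketch only: the `RecommendationSystem` class, its constructor arguments, and the stand-in lambdas are hypothetical, and the offline feature collection module, which feeds training rather than serving, is left out.

```python
from typing import Callable, Dict, List


class RecommendationSystem:
    """Wires the online modules together; each module is a plain callable here."""

    def __init__(
        self,
        acquire_online: Callable[[str], Dict[str, float]],    # online feature acquisition
        process_features: Callable[[Dict[str, float]], List[float]],  # feature processing
        score: Callable[[List[float], str], float],           # trained model, per material
        material_ids: List[str],
    ):
        self.acquire_online = acquire_online
        self.process_features = process_features
        self.score = score
        self.material_ids = material_ids

    def recommend(self, approver_id: str, n: int = 10) -> List[str]:
        raw = self.acquire_online(approver_id)
        vector = self.process_features(raw)
        scored = {m: self.score(vector, m) for m in self.material_ids}
        # Recommendation module: top n by predicted click-through rate.
        return sorted(scored, key=lambda m: (-scored[m], m))[:n]


# Stand-in modules with made-up behavior, for illustration only.
weights = {"fin-q3": 0.9, "network": 0.5, "industry": 0.1}
system = RecommendationSystem(
    acquire_online=lambda approver_id: {"login_hour": 9.0},
    process_features=lambda raw: [raw["login_hour"] / 24.0],
    score=lambda vector, material_id: vector[0] * weights[material_id],
    material_ids=list(weights),
)
print(system.recommend("a1", n=2))
```

Passing the modules in as callables keeps the serving path independent of how features are collected or which model produced the scores.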
The above embodiments express only specific implementations of the present application; their description is specific and detailed, but is not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art can make several changes and modifications without departing from the technical idea of the present application, all of which fall within the protection scope of the present application.
Claims (6)
1. A machine-learning-based manual credit review information recommendation method, characterized by comprising the following steps:
Step 1: content material preparation: standardizing the data of the object under review, generating table and graphic materials, and forming corresponding content links;
Step 2: feature collection: collecting the approvers' behavior data, environment data, and the feature set of the reviewed object;
Step 3: feature processing: processing the features collected in step 2 into standardized feature vectors and attaching content preference labels;
Step 4: model training: inputting the standardized feature vectors obtained in step 3 into a combined model formed by a GBDT model and a logistic regression model for training, to form a mapping between feature vectors and content materials;
Step 5: model prediction: applying the feature processing of step 3 to the user information to be predicted to obtain a standardized feature vector, inputting that vector into the model trained in step 4, and outputting the top n content materials ranked by predicted click-through rate, where n is an integer greater than 0.
2. The machine-learning-based manual credit review information recommendation method according to claim 1, wherein in step 1 the content material includes information related to the applicant, information related to the applicant's main enterprise, information related to the main enterprise's associated enterprises, information on the industry to which the enterprise belongs, and online public opinion about the applicant or the main enterprise.
3. The machine-learning-based manual credit review information recommendation method according to claim 2, wherein the applicant-related information includes the applicant's age, gender, position, educational background, and personal liability size; the information related to the applicant's main enterprise includes the enterprise's region, revenue scale, staff scale, tax income, and financial statements; the information related to the main enterprise's associated enterprises includes each associated enterprise's region, revenue scale, staff scale, tax income, and financial statements.
4. The machine-learning-based manual credit review information recommendation method according to claim 1, wherein the approver behavior data in step 2 includes the approver's click-through rate on the content materials, the dwell time recorded by a timer, and the click sequence; the environment data includes the login time recorded by a timer, the current approval node, behavior data of other approvers, approval time, and the attributes of the applicant and enterprise under review.
5. The machine-learning-based manual credit review information recommendation method according to claim 1, wherein step 5 specifically comprises: inputting the standardized feature vector into the GBDT model, which screens and combines features to produce a discrete feature vector; inputting the discrete feature vector into the logistic regression model, which predicts the user's click-through rate on each content material according to the mapping between discrete feature vectors and content materials; and screening out and outputting the top n content materials by click-through rate.
6. A machine learning-based manual credit review information recommendation system for use in the method of any of claims 1 to 5, comprising:
an online feature acquisition module: used for acquiring the user information to be predicted in real time;
an offline feature collection module: used for collecting the approvers' behavior data, environment data, and the feature set of the reviewed object;
a feature processing module: used for processing the features obtained by the online feature acquisition module and the offline feature collection module into standardized feature vectors and labeling them;
a recommendation module: used for outputting the top n content materials by click-through rate according to the input standardized feature vector.
Priority Applications (1)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202111536543.1A | CN114299350B (en) | 2021-12-15 | 2021-12-15 | Manual credit auditing information recommendation method and system based on machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114299350A (en) | 2022-04-08 |
CN114299350B (en) | 2024-08-27 |
Family
ID=80967563
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202111536543.1A | Active | CN114299350B (en) | 2021-12-15 | 2021-12-15 | Manual credit auditing information recommendation method and system based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114299350B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200005196A1 (en) * | 2018-06-27 | 2020-01-02 | Microsoft Technology Licensing, Llc | Personalization enhanced recommendation models |
CN109829116A (en) * | 2019-02-14 | 2019-05-31 | 北京达佳互联信息技术有限公司 | A kind of content recommendation method, device, server and computer readable storage medium |
CN113761348A (en) * | 2021-02-26 | 2021-12-07 | 北京沃东天骏信息技术有限公司 | Information recommendation method and device, electronic equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
Xinran He et al.: "Practical Lessons from Predicting Clicks on Ads at Facebook", Proceedings of the Eighth International Workshop on Data Mining for Online Advertising, 24 August 2014 (2014-08-24), pages 1 - 9, XP058056336, DOI: 10.1145/2648584.2648589 * |
Lu Bingyang (陆炳杨): "Research on Borrower Credit Risk of P2P Online Lending Platforms Based on Tree Ensemble Models" (基于树集成模型的P2P网贷平台借款人信用风险研究), China Master's Theses Full-text Database: Economics and Management Sciences, no. 2020, 15 January 2020 (2020-01-15), pages 157 - 420 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116579749A (en) * | 2023-07-13 | 2023-08-11 | 浙江保融科技股份有限公司 | Method and device for running auditing flow based on RPA robot |
CN116579749B (en) * | 2023-07-13 | 2023-11-14 | 浙江保融科技股份有限公司 | Method and device for running auditing flow based on RPA robot |
Also Published As
Publication number | Publication date |
---|---|
CN114299350B (en) | 2024-08-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Veh et al. | Corporate reputation in management research: a review of the literature and assessment of the concept | |
Adegboyegun et al. | Integrated reporting and corporate performance in Nigeria: Evidence from the banking industry | |
CN108154401B (en) | User portrait depicting method, device, medium and computing equipment | |
US11276007B2 (en) | Method and system for composite scoring, classification, and decision making based on machine learning | |
Terblanche et al. | The influence of integrated reporting and internationalisation on intellectual capital disclosures | |
Bachmann et al. | The impact of international outsourcing on labour market dynamics in Germany | |
CN107851097B (en) | Data analysis system, data analysis method, data analysis program, and storage medium | |
Cheung et al. | A multi-perspective knowledge-based system for customer service management | |
CN109558429A (en) | The two-way recommendation method and system of talent service based on internet big data | |
US20130254298A1 (en) | Method and collaboration system | |
KR20200030252A (en) | Apparatus and method for providing artwork | |
Al-Dhamari et al. | Audit partners gender, auditor quality and clients value relevance | |
WO2021042006A1 (en) | Data driven systems and methods for optimization of a target business | |
CN110009503A (en) | Finance product recommended method, device, computer equipment and storage medium | |
US20220261819A1 (en) | System and method for determining and managing environmental, social, and governance (esg) perception of entities and industries through use of survey and media data | |
CN113869624A (en) | Investment information providing system and method | |
CN115619571A (en) | Financing planning method, system and device | |
CN112102006A (en) | Target customer acquisition method, target customer search method and target customer search device based on big data analysis | |
Lin et al. | A computer-based approach for analyzing consumer demands in electronic word-of-mouth | |
CN114299350A (en) | Artificial credit auditing information recommendation method and system based on machine learning | |
Hazarika et al. | Are numeric ratings true representations of reviews? A study of inconsistency between reviews and ratings | |
CN115829205A (en) | Method and device for monitoring due-employment level of manual credit examiner for assisting credit of small and micro enterprises | |
Nikolić et al. | Application of FAHP–PROMETHEE hybrid model for prioritizing SMEs failure factors | |
KR101853555B1 (en) | SW Market Information Service System | |
Maryam et al. | Evaluate implementation of enterprise resource planning (ERP) system |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |