CN119323472A - Evaluation method and device for credit wind control data source, electronic equipment and computer program product - Google Patents

Publication number
CN119323472A
Authority
CN
China
Prior art keywords
data source
evaluation
index
external data
wind control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411212362.7A
Other languages
Chinese (zh)
Inventor
张俸洋
段悦伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Longyingzhida Beijing Technology Co ltd
Original Assignee
Longyingzhida Beijing Technology Co ltd

Landscapes

  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The present application discloses a method, device, electronic device, and computer program product for evaluating a credit risk control data source, the method comprising: obtaining credit risk control data and user evaluation configuration information, the credit risk control data comprising credit risk control data from an internal data source and multiple external data sources; processing the credit risk control data from multiple external data sources respectively according to the basic indicator evaluation strategy corresponding to the evaluation configuration information, and obtaining the basic indicator score of each external data source; constructing a risk prediction model based on internal and external credit risk control data, and determining the model gain score of each external data source in combination with the model gain evaluation strategy corresponding to the evaluation configuration information; determining and outputting the final evaluation result according to the basic indicator score and model gain score of each external data source. The present application improves the accuracy, flexibility, and evaluation efficiency of the credit risk control data source evaluation, realizes horizontal comparison between external data sources, and is beneficial to the decision-making use of application personnel.

Description

Evaluation method and device for credit risk control data source, electronic equipment and computer program product
Technical Field
The application relates to the technical field of credit risk control, and in particular to a method and device for evaluating a credit risk control data source, an electronic device, and a computer program product.
Background
In the field of credit risk management, financial institutions and third-party service providers typically need to evaluate and manage risk using large amounts of third-party credit data, which involves considering multiple aspects of a borrower, such as credit history, repayment capability and financial status.
Existing implementations mainly fall into the following two types:
1) Manual test evaluation: statistical analysis software such as SAS or Python is used as the analysis tool, and external data is tested by manually checking the data and generating reports. This approach is inefficient and error-prone.
2) Automated test evaluation: the test process is fixed in advance and the test report is generated automatically. This approach often lacks flexibility: it cannot adjust the analysis strategy to the characteristics of different data sets, and it adapts poorly to different scenarios and specific requirements.
Disclosure of Invention
The embodiments of the application provide a method and device for evaluating a credit risk control data source, an electronic device, and a computer program product, so as to improve the efficiency, accuracy and flexibility of external data source evaluation in credit risk control scenarios.
The embodiment of the application adopts the following technical scheme:
In a first aspect, an embodiment of the present application provides a method for evaluating a credit risk control data source, the method including:
acquiring credit risk control data and evaluation configuration information of a user, wherein the credit risk control data includes credit risk control data of an internal data source and credit risk control data of a plurality of external data sources;
processing the credit risk control data of the plurality of external data sources respectively according to a basic index evaluation strategy corresponding to the evaluation configuration information, to obtain a basic index score of each external data source;
constructing a risk prediction model based on the credit risk control data of the internal data source and the credit risk control data of the plurality of external data sources, and determining, based on the risk prediction model, a model gain score of each external data source using a model gain evaluation strategy corresponding to the evaluation configuration information; and
determining and outputting a final evaluation result of each external data source according to the basic index score and the model gain score of each external data source.
Optionally, the basic index evaluation policy includes a plurality of index dimensions, and processing the credit risk control data of the plurality of external data sources according to the basic index evaluation policy corresponding to the evaluation configuration information to obtain a basic index score of each external data source includes:
calculating, according to the credit risk control data of the plurality of external data sources, an index score of each external data source in each index dimension; and
summing the index scores of each external data source over the index dimensions to obtain the basic index score of each external data source.
Optionally, the basic index evaluation policy further includes an index weight, index value intervals, and corresponding interval scores, and calculating the index score of each external data source in each index dimension according to the credit risk control data of the plurality of external data sources includes:
determining the interval distribution proportion of the credit risk control data in each index dimension according to the values of the credit risk control data corresponding to that index dimension and the index value intervals; and
calculating the index score of each index dimension according to the index weight, the index value intervals, the interval scores, and the interval distribution proportion of the credit risk control data in that index dimension.
Optionally, constructing a risk prediction model based on the credit risk control data of the internal data source and the credit risk control data of the plurality of external data sources includes:
training a first risk prediction model using the credit risk control data of the internal data source;
fusing the credit risk control data of the internal data source with the credit risk control data of each of the plurality of external data sources respectively, to obtain a plurality of sets of fused credit risk control data; and
training a second risk prediction model on each set of fused credit risk control data respectively.
Optionally, the model gain evaluation policy includes a model gain evaluation index, and determining, based on the risk prediction model, a model gain score of each external data source using the model gain evaluation policy corresponding to the evaluation configuration information includes:
evaluating, using the model gain evaluation index, the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source respectively; and
calculating the model gain score of each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to that external data source.
Optionally, the model gain evaluation policy further includes index value improvement intervals of the model gain evaluation index and corresponding interval scores, and calculating the model gain score of each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source includes:
calculating, according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source, the index improvement value of the model gain evaluation index corresponding to that external data source; and
calculating the model gain score of each external data source according to the index improvement value, the index value improvement intervals, and the corresponding interval scores of the model gain evaluation index corresponding to that external data source.
Optionally, determining and outputting a final evaluation result of each external data source according to the basic index score and the model gain score of each external data source includes:
calculating a composite score of each external data source according to the basic index score and the model gain score of that external data source; and
generating and outputting, based on the composite score of each external data source, an evaluation report using a preset report generation model, the evaluation report including an evaluation report for each external data source and/or a comparative analysis report across external data sources.
Optionally, after determining and outputting the final evaluation result of each external data source according to the basic index score and the model gain score of each external data source, the method further includes:
receiving evaluation configuration adjustment information from the user; and
updating the evaluation report according to the evaluation configuration adjustment information and outputting the updated report.
In a second aspect, an embodiment of the present application further provides a device for evaluating a credit risk control data source, the device including:
an acquisition unit configured to acquire credit risk control data and evaluation configuration information of a user, the credit risk control data including credit risk control data of an internal data source and credit risk control data of a plurality of external data sources;
a first scoring unit configured to process the credit risk control data of the plurality of external data sources respectively according to a basic index evaluation strategy corresponding to the evaluation configuration information, to obtain a basic index score of each external data source;
a second scoring unit configured to construct a risk prediction model based on the credit risk control data of the internal data source and the credit risk control data of the plurality of external data sources, and to determine, based on the risk prediction model, a model gain score of each external data source using a model gain evaluation policy corresponding to the evaluation configuration information; and
a determining unit configured to determine and output a final evaluation result of each external data source according to the basic index score and the model gain score of each external data source.
In a third aspect, an embodiment of the present application further provides an electronic device, including:
a processor; and
a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform any of the methods for evaluating a credit risk control data source described above.
In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform any of the methods for evaluating a credit risk control data source described above.
The beneficial effects of the method for evaluating a credit risk control data source are as follows. Credit risk control data and evaluation configuration information of a user are first obtained, where the credit risk control data includes credit risk control data of an internal data source and of a plurality of external data sources. The credit risk control data of the external data sources is then processed according to the basic index evaluation strategy corresponding to the evaluation configuration information to obtain the basic index score of each external data source. Next, a risk prediction model is built based on the credit risk control data of the internal data source and of the external data sources, and the model gain score of each external data source is determined using the model gain evaluation strategy corresponding to the evaluation configuration information. Finally, the final evaluation result of each external data source is determined and output according to its basic index score and model gain score. The method improves the accuracy and flexibility of evaluating different external data sources in credit risk control scenarios, enables horizontal comparison between external data sources, and supports decision-making by application personnel. The whole process can be automated without manual operation, which improves the efficiency of evaluating and applying data sources.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart of a method for evaluating a credit risk control data source in an embodiment of the application;
FIG. 2 is a schematic structural diagram of a device for evaluating a credit risk control data source in an embodiment of the application;
FIG. 3 is a schematic structural diagram of an electronic device in an embodiment of the application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
An embodiment of the application provides a method for evaluating a credit risk control data source. FIG. 1 shows a flow chart of the method, which includes at least the following steps S110 to S140:
Step S110: obtain credit risk control data and evaluation configuration information of the user, wherein the credit risk control data includes credit risk control data of an internal data source and credit risk control data of a plurality of external data sources.
The credit risk control data source to be evaluated in the embodiment of the application mainly refers to a third-party data source, that is, processed data with its own characteristics that an external third-party platform has formed through long-term business accumulation. Such data may include multi-lender loan data, payment data, credit rating data and the like, and plays an important role in fields such as financial risk control and market analysis.
When evaluating credit risk control data sources, the credit risk control data must first be acquired. It can be divided into credit risk control data of an internal data source and of a plurality of external data sources. The internal data source mainly refers to data generated and accumulated by a credit institution in the course of its business operations; this data is usually highly specific and timely, and is an important basis for the credit institution to understand the credit condition of borrowers and assess loan risk. The external data sources are the third-party data sources described above. Because external credit risk control data sources are numerous and varied, their strengths and weaknesses need to be comprehensively evaluated and analyzed.
In addition, the evaluation configuration information of the user needs to be obtained. That is, a business party that needs to evaluate data sources can flexibly configure the corresponding evaluation strategy according to its own business requirements and usage scenario, for example by configuring the key indexes to be evaluated and how the evaluation result is output.
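As a minimal sketch of what such evaluation configuration information might look like, the following uses a plain data class. The class name `EvaluationConfig`, its field names, and the listed values are hypothetical illustrations and are not specified by the application.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationConfig:
    """Hypothetical container for a user's evaluation configuration."""
    # Basic index dimensions to evaluate (illustrative names).
    base_indices: list = field(
        default_factory=lambda: ["missing_rate", "concentration", "iv"])
    # Model gain evaluation indexes to use.
    gain_indices: list = field(default_factory=lambda: ["auc", "ks"])
    # Which reports to output: per-source and/or cross-source comparison.
    report_types: list = field(
        default_factory=lambda: ["per_source", "comparison"])


# A business party overrides only the settings it cares about.
config = EvaluationConfig(base_indices=["missing_rate", "iv", "stability"])
```

Keeping the configuration separate from the scoring logic is one way to let the same pipeline be rerun when the user later adjusts the configuration.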
Step S120: process the credit risk control data of the plurality of external data sources respectively according to the basic index evaluation strategy corresponding to the evaluation configuration information, to obtain the basic index score of each external data source.
The embodiment of the application divides the evaluation of external data sources into two aspects. The first is the evaluation of basic indexes. Because different evaluation configuration information affects which evaluation strategy is actually adopted, the basic index evaluation strategy currently in use must be determined from the evaluation configuration information. That strategy is then used to statistically analyze the credit risk control data of each external data source, yielding the basic index score of each external data source.
Step S130: construct a risk prediction model based on the credit risk control data of the internal data source and the credit risk control data of the plurality of external data sources, and determine, based on the risk prediction model, the model gain score of each external data source using the model gain evaluation strategy corresponding to the evaluation configuration information.
The second aspect of external data source evaluation in the embodiment of the application is the evaluation of the model gain effect, which mainly assesses how much each external data source improves the prediction effect of risk-control-related models. The risk prediction model is therefore constructed using the credit risk control data of the internal data source and of the plurality of external data sources, and the model gain score of each external data source is then calculated, based on the risk prediction model, using the model gain evaluation strategy corresponding to the evaluation configuration information.
Step S140: determine and output the final evaluation result of each external data source according to the basic index score and the model gain score of each external data source.
The basic index score reflects the quality of the external data source's own data on certain key indexes, and the model gain score reflects how much the external data source improves the prediction effect of the existing risk control model. The embodiment of the application can therefore combine the basic index score and the model gain score of each external data source to determine and output its final evaluation result, which the business party can then interpret and use for decision-making.
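The application does not state how the two scores are combined. The sketch below assumes a simple weighted sum and then ranks the sources for horizontal comparison; the function names, the `base_weight` parameter, and the sample scores are all hypothetical.

```python
def composite_scores(base_scores, gain_scores, base_weight=0.5):
    """Combine the two scores per source with an ASSUMED weighted sum."""
    gain_weight = 1.0 - base_weight
    return {
        name: base_weight * base_scores[name] + gain_weight * gain_scores[name]
        for name in base_scores
    }


def rank_sources(scores):
    """Order source names from highest to lowest composite score."""
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative scores for two external data sources.
base = {"source_a": 80.0, "source_b": 60.0}
gain = {"source_a": 40.0, "source_b": 90.0}
scores = composite_scores(base, gain)
ranking = rank_sources(scores)
```

With equal weights, `source_b` ranks first here because its larger model gain outweighs its lower basic index score; a different `base_weight` could reverse the order.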
The method for evaluating a credit risk control data source improves the accuracy and flexibility of evaluating different external data sources in credit risk control scenarios, enables horizontal comparison between external data sources, and supports decision-making by application personnel. The whole process can be automated without manual operation, which improves the efficiency of evaluating and applying data sources.
In some embodiments of the present application, the credit risk control data may be read by a data reading engine that is compatible with the CSV and XLSX formats and supports batch import, merging and aggregation of data.
In addition, because data sources are diverse, the data may have quality problems such as missing values, input errors and inconsistent formats. Traditional approaches do not take these problems into account, which directly affects the accuracy of risk assessment. Moreover, in the traditional risk assessment process, work such as data cleaning and statistical analysis often has to be done manually, which is time-consuming, labor-intensive and prone to human error.
Based on this, the embodiment of the application may also provide a data parser that identifies the data types and structure, allows them to be set manually, and automatically maps them to the system's internal data model. If an automatically identified field type is incorrect, manual modification is supported; for example, if "date of birth" is identified as a string, it can be manually changed to a date format. Furthermore, a data checking mechanism can be introduced to detect data format errors and potential data quality problems, ensuring the accuracy of subsequent data processing.
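The following is a minimal sketch of automatic type identification with a manual override, using only the Python standard library and CSV input. The function names, the type labels, and the assumed `YYYY-MM-DD` date layout are illustrative assumptions, not details from the application.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Guess a column type from its non-empty values: int, float, or str."""
    present = [v for v in values if v != ""]
    for caster, name in ((int, "int"), (float, "float")):
        try:
            for v in present:
                caster(v)
            return name
        except ValueError:
            continue
    return "str"


def parse_csv(text, overrides=None):
    """Parse CSV text; `overrides` maps column name -> manually set type."""
    overrides = overrides or {}
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    schema = {c: overrides.get(c, infer_type([r[c] for r in rows]))
              for c in columns}
    casters = {
        "int": int,
        "float": float,
        "str": str,
        # Assumed date layout; a real parser would make this configurable.
        "date": lambda v: datetime.strptime(v, "%Y-%m-%d").date(),
    }
    # Empty cells become None so later steps can count missing values.
    parsed = [{c: (casters[schema[c]](r[c]) if r[c] != "" else None)
               for c in columns} for r in rows]
    return schema, parsed


raw = "id,date_of_birth,income\n1,1990-05-17,5200.5\n2,1985-11-02,\n"
# "date_of_birth" would be inferred as str, so it is overridden manually.
schema, records = parse_csv(raw, overrides={"date_of_birth": "date"})
```

A cast that fails under a manually chosen type raises `ValueError`, which is one simple place to hook in the data checking mechanism described above.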
In some embodiments of the present application, the basic index evaluation policy includes a plurality of index dimensions. Processing the credit risk control data of the plurality of external data sources according to the basic index evaluation policy corresponding to the evaluation configuration information to obtain a basic index score of each external data source includes: calculating, according to the credit risk control data of the plurality of external data sources, an index score of each external data source in each index dimension; and summing the index scores of each external data source over the index dimensions to obtain the basic index score of each external data source.
According to the data characteristics of different external data sources and the needs of the business, the embodiment of the application may define multiple evaluation indexes in advance, which may include the following index dimensions:
1) Missing rate: the proportion of missing values in each column of data.
2) Concentration: the proportion of the most frequent value in each column of data.
3) Correlation: the linear and nonlinear relationships between variables, analyzed using the Pearson correlation coefficient or the Spearman rank correlation coefficient.
4) IV (information value): the contribution of a single variable to the predictive power for the target variable. The feature variable is binned using a supervised method such as decision-tree binning or chi-square binning, and the weight of evidence (WOE) of each bin is calculated as follows:
bin_woe=ln((bin_bad/total_bad)*(total_good/bin_good))
where bin_bad is the number of bad samples in one bin of the variable, bin_good is the number of good samples in that bin, total_bad is the total number of bad samples, and total_good is the total number of good samples.
The IV of the variable is then calculated as follows:
bin_iv=((bin_bad/total_bad)-(bin_good/total_good))*bin_woe
IV=sum(bin_iv)
5) Stability: the trend of the data over time, detected through time series analysis, to ensure the long-term stability of the data.
6) Hit rate = 1 - missing rate.
7) Accuracy/coverage (blacklist class) = number of blacklist hits / number of blacklist hits in the sample.
8) False rejection rate = number of blacklist hits / number of customers who passed at the earlier stage and performed well after lending.
9) Effective difference rate = number of blacklist hits / number of samples that passed and are bad.
10) Invalid difference rate = number of blacklist hits / other rejections in the sample.
It should be noted that the above evaluation indexes are merely examples given for the embodiments of the present application. Which evaluation indexes are actually used can be set flexibly by those skilled in the art according to actual needs, and is not limited here.
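The WOE and IV calculation for one binned variable can be sketched as follows. The sketch uses the standard definition in which each bin contributes (bad share - good share) × WOE, with WOE written as in the formula above. The function name and the sample bin counts are hypothetical, and every bin is assumed to contain at least one good and one bad sample so the logarithm is defined.

```python
import math


def information_value(bins):
    """Compute IV from a list of (bin_bad, bin_good) counts per bin.

    Assumes every bin has non-zero good and bad counts.
    """
    total_bad = sum(bad for bad, _ in bins)
    total_good = sum(good for _, good in bins)
    iv = 0.0
    for bin_bad, bin_good in bins:
        bad_share = bin_bad / total_bad
        good_share = bin_good / total_good
        # Same quantity as ln((bin_bad/total_bad)*(total_good/bin_good)).
        bin_woe = math.log(bad_share / good_share)
        iv += (bad_share - good_share) * bin_woe
    return iv


# Three bins whose bad rate rises from 10% to 60% (illustrative counts).
bins = [(10, 90), (30, 70), (60, 40)]
iv = information_value(bins)
```

Because the sign of (bad share - good share) always matches the sign of the WOE, every bin's contribution is non-negative, and a variable whose bins all have the same good/bad mix gets an IV of zero.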
Based on these evaluation indexes, the index score of each external data source in each index dimension can be calculated from the credit risk control data of that external data source, and the scores are finally aggregated per data source to obtain the basic index score of each data source.
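As a small illustration of how column-level index values might be computed before scoring, the sketch below covers the missing rate, the hit rate, and the concentration. Missing values are represented as `None`, and concentration is computed over non-missing values only; both choices, like the function names, are assumptions made for this sketch.

```python
from collections import Counter


def missing_rate(column):
    """Proportion of missing (None) values in a column."""
    return sum(v is None for v in column) / len(column)


def hit_rate(column):
    """Hit rate, defined as 1 - missing rate."""
    return 1.0 - missing_rate(column)


def concentration(column):
    """Share of the most frequent value among non-missing values."""
    present = [v for v in column if v is not None]
    if not present:
        return 0.0
    top_count = Counter(present).most_common(1)[0][1]
    return top_count / len(present)


# One column returned by a hypothetical external data source.
column = ["A", "A", None, "B"]
rates = (missing_rate(column), hit_rate(column), concentration(column))
```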
In the specific calculation, the embodiment of the application may calculate index values using statistics, machine learning algorithms and the like. The calculation method of each index is built into the system, so the method and the related thresholds can be adjusted on the corresponding pages and the calculation can be performed without writing code. For example, for correlation, the Pearson or Spearman algorithm can be selected and a corresponding screening threshold can be set.
In some embodiments of the present application, the basic index evaluation policy further includes an index weight, index value intervals, and corresponding interval scores. Calculating the index score of each external data source in each index dimension according to the credit risk control data of the plurality of external data sources includes: determining the interval distribution proportion of the credit risk control data in each index dimension according to the values of the credit risk control data corresponding to that index dimension and the index value intervals; and calculating the index score of each index dimension according to the index weight, the index value intervals, the interval scores, and the interval distribution proportion of the credit risk control data in that index dimension.
The basic index evaluation strategy of the embodiment of the application further comprises index weights, index value intervals and corresponding interval scores, and the score of each index dimension is calculated according to the index weight corresponding to each index dimension, the index value interval where the data variable falls, the interval score and the duty ratio of the data variable falling in the index value interval.
Taking variable data as an example, with a full score of 100 points as shown in Table 1 below: for the stability index, 80% of the variables (by count) fall in the interval (0, 0.05), and the score for that interval is 24 points. The sum of the stability index's scores over all intervals is the total score of the stability index, and the sum of the scores of all indexes of the current data source is the total basic index score of that data source.
TABLE 1
It should be noted that Table 1 above is merely an example of a specific implementation of the embodiment of the present application. How the index weights are set, how the index value intervals are divided, and how the interval scores are set can all be adjusted flexibly by those skilled in the art according to actual requirements, and are not specifically limited here.
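The interval-based scoring can be sketched as follows. Since the contents of Table 1 are not available here, the exact combination rule is an assumption: each interval is taken to contribute weight × interval score × proportion of variables in that interval. The weight of 0.3, the interval boundaries beyond (0, 0.05), and the interval scores are hypothetical values chosen so that the first interval contributes the 24 points mentioned in the example. Intervals are treated as half-open for simplicity.

```python
def interval_proportions(values, intervals):
    """Share of values falling in each half-open interval [low, high)."""
    counts = [0] * len(intervals)
    for v in values:
        for i, (low, high) in enumerate(intervals):
            if low <= v < high:
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


def index_score(values, weight, intervals, interval_scores):
    """ASSUMED rule: weight * sum(interval score * interval proportion)."""
    proportions = interval_proportions(values, intervals)
    return weight * sum(s * p for s, p in zip(interval_scores, proportions))


# Stability values of 10 variables: 80% fall in the first interval.
stability_values = [0.01] * 8 + [0.07] * 2
intervals = [(0.0, 0.05), (0.05, 0.1), (0.1, float("inf"))]
interval_scores = [100, 50, 0]  # hypothetical per-interval scores
proportions = interval_proportions(stability_values, intervals)
stability_score = index_score(stability_values, 0.3, intervals, interval_scores)
```

Under these assumed numbers the first interval contributes 0.3 × 100 × 0.8 = 24 points and the second 0.3 × 50 × 0.2 = 3 points; the basic index score of a data source would then be the sum of such scores over all of its index dimensions.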
In some embodiments of the present application, constructing a risk prediction model based on the credit risk control data of the internal data source and the credit risk control data of the plurality of external data sources includes: training a first risk prediction model using the credit risk control data of the internal data source; fusing the credit risk control data of the internal data source with the credit risk control data of each of the plurality of external data sources respectively, to obtain a plurality of sets of fused credit risk control data; and training a second risk prediction model on each set of fused credit risk control data respectively.
When evaluating the model gain effect of an external data source, risk prediction models are constructed in two ways. On the one hand, a model is built on the existing internal data source alone, that is, a first risk prediction model is trained using the existing internal data source. On the other hand, the data of the internal data source is fused with the data of each external data source respectively, and a second risk prediction model is built for each fused data set, yielding a plurality of second risk prediction models.
The risk prediction model may be, for example, a model related to credit risk control, such as one that predicts the probability that a user defaults or the probability that a loan becomes overdue. Which model structure is used for training can be chosen flexibly by those skilled in the art in combination with the prior art, and is not specifically limited here.
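The two-sided model construction can be sketched as below. Fusion is assumed to be a join on a shared customer identifier, and the training routine is passed in as a callable so that any model type can be plugged in; `fuse`, `build_models`, `train_stub`, and the sample feature names are all hypothetical. The stub does not train anything and only records which features it was given.

```python
def fuse(internal, external):
    """Attach external features to internal records, matched by customer id."""
    fused = {}
    for customer_id, features in internal.items():
        merged = dict(features)
        merged.update(external.get(customer_id, {}))
        fused[customer_id] = merged
    return fused


def build_models(internal, externals, labels, train):
    """Train one internal-only model and one fused model per external source."""
    first_model = train(internal, labels)
    second_models = {name: train(fuse(internal, ext), labels)
                     for name, ext in externals.items()}
    return first_model, second_models


def train_stub(records, labels):
    """Placeholder for real training: returns the feature names it saw."""
    return sorted({name for feats in records.values() for name in feats})


internal = {1: {"income": 5000}, 2: {"income": 3000}}
externals = {
    "source_a": {1: {"multi_loan_count": 3}},
    "source_b": {2: {"credit_grade": 2}},
}
labels = {1: 0, 2: 1}  # 1 marks a bad (defaulted) sample
first_model, second_models = build_models(internal, externals, labels, train_stub)
```

Training every second model on the same internal records and labels keeps the comparison fair: any difference in prediction effect can then be attributed to the added external features.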
In some embodiments of the present application, the model gain evaluation policy includes a model gain evaluation index. Determining, based on the risk prediction model, a model gain score for each external data source using the model gain evaluation policy corresponding to the evaluation configuration information includes: evaluating, using the model gain evaluation index, the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source respectively; and calculating the model gain score of each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to that external data source.
The embodiment of the application can evaluate the prediction effect of a model using predefined model gain evaluation indexes, which may include an AUC index and a KS index. The AUC index is the area under the ROC curve (Area Under Curve); the higher the AUC value, the stronger the model's ability to rank positive samples ahead of negative samples, that is, the better the model's predictive discrimination. The KS index measures the model's ability to distinguish positive samples from negative samples and is calculated as the maximum difference between the cumulative distributions of the positive and negative samples. The larger the KS value, the stronger the model's risk discrimination capability.
The index value of the model gain evaluation index of the second risk prediction model corresponding to each external data source is compared with the index value of the model gain evaluation index of the first risk prediction model; the gain effect of each external data source on the risk prediction model is then analyzed, and the model gain score is calculated.
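The two indexes can be computed directly from labels and model scores. The sketch below is a plain-Python illustration of the standard definitions (pairwise AUC with ties counted as one half, and KS as the maximum gap between the two empirical cumulative distributions); the function names and sample data are hypothetical.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both positive and negative samples are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def ks(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Maximum absolute gap between the score CDFs of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both positive and negative samples are required")
    best = 0.0
    for threshold in sorted(set(scores)):
        cdf_pos = sum(s <= threshold for s in pos) / len(pos)
        cdf_neg = sum(s <= threshold for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Hypothetical labels (1 = positive sample) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc_value = auc(labels, scores)  # 8 of 9 pairs are ordered correctly
ks_value = ks(labels, scores)    # largest CDF gap is 2/3
```

The pairwise AUC loop is quadratic in the sample count and is meant only to make the definition explicit; a production system would normally use a sorted, rank-based computation.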
In some embodiments of the present application, the model gain evaluation policy further includes an index value increasing section and a corresponding section score of the model gain evaluation index, and calculating the model gain score of each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source includes calculating the index increasing value of the model gain evaluation index corresponding to each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source, and calculating the model gain score of each external data source according to the index increasing value, the index value increasing section and the corresponding section score of the model gain evaluation index corresponding to each external data source.
The model gain evaluation strategy of the embodiment of the application further comprises an index value increasing section of the model gain evaluation index and a corresponding section score. That is, the improvement in model effect is divided into a plurality of sections, and different sections correspond to different gain scores, which improves the accuracy of the model gain effect evaluation.
As shown in Table 2, for example, if KS increases by 0.015, which falls within the (0.01, 0.02] interval, the corresponding KS gain score is 40 points; if AUC increases by 0.025, which falls within the (0.02, 0.03] interval, the corresponding AUC gain score is 70 points. With a weight of 0.5 for each of the two indexes, the model gain score = 40×0.5 + 70×0.5 = 55.
TABLE 2
Index increasing value (model lift)    KS gain score    AUC gain score
[0, 0.01]                              10               10
(0.01, 0.02]                           40               40
(0.02, 0.03]                           70               70
(0.03, 0.04]                           90               90
(0.04, inf)                            100              100
It should be noted that Table 2 above is merely an example of a specific implementation of the embodiment of the present application. How the index value intervals are divided and how the interval scores are set can be flexibly adjusted by those skilled in the art according to actual requirements, and is not limited herein.
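The interval lookup of Table 2 and the worked example above can be sketched as follows. The interval bounds and scores are taken from Table 2 and the 0.5 weights from the example; treating a negative lift as a score of 0 is an assumption, since Table 2 does not cover that case, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

# (upper bound, score) pairs from Table 2; each interval is closed on the right,
# except the last, which is unbounded.
GAIN_INTERVALS: List[Tuple[float, int]] = [
    (0.01, 10),
    (0.02, 40),
    (0.03, 70),
    (0.04, 90),
    (math.inf, 100),
]


def interval_score(lift: float) -> int:
    """Map an index increasing value (lift) to its interval score."""
    if lift < 0:
        return 0  # assumption: Table 2 does not define a score for negative lift
    for upper, score in GAIN_INTERVALS:
        if lift <= upper:
            return score
    return GAIN_INTERVALS[-1][1]  # unreachable for real numbers, kept for safety


def model_gain_score(ks_lift: float, auc_lift: float,
                     ks_weight: float = 0.5, auc_weight: float = 0.5) -> float:
    """Weighted sum of the KS and AUC interval scores."""
    return interval_score(ks_lift) * ks_weight + interval_score(auc_lift) * auc_weight


example_score = model_gain_score(ks_lift=0.015, auc_lift=0.025)  # 40*0.5 + 70*0.5
```

Here ks_lift and auc_lift stand for the index value of the second risk prediction model minus that of the first risk prediction model.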
In some embodiments of the application, determining and outputting the final evaluation result of each external data source according to the base index score and the model gain score of each external data source comprises: calculating a composite score of each external data source according to the base index score and the model gain score of each external data source; and, based on the composite score of each external data source, generating and outputting an evaluation report using a preset report generation model, the evaluation report comprising an evaluation report of each external data source and/or a comparative analysis report among the external data sources.
After the base index score and the model gain score of each external data source are calculated based on the foregoing embodiments, the base index score and the model gain score of each external data source may be weighted and summed to obtain a composite score of each external data source. For example, it can be expressed in the following form:
Composite score of data source = base index score of data source × α + model gain score of data source × β
where α + β = 1
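As a minimal sketch of the weighted sum above: alpha is the weight of the base index score and beta = 1 - alpha is the weight of the model gain score. The function name and the example values (a base index score of 80, a model gain score of 55, alpha = 0.5) are hypothetical.

```python
def composite_score(base_index_score: float, model_gain_score: float,
                    alpha: float) -> float:
    """Weighted sum of the two scores, with beta derived so that alpha + beta = 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha
    return base_index_score * alpha + model_gain_score * beta


# Hypothetical scores for one external data source.
score = composite_score(base_index_score=80.0, model_gain_score=55.0, alpha=0.5)
```

Deriving beta from alpha, rather than accepting both as inputs, keeps the constraint α + β = 1 satisfied by construction.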
Considering that the existing data source evaluation report generation process is usually very complex, the analysis results need to be manually summarized and detailed reports need to be written, and the process is time-consuming and difficult to ensure the consistency and the integrity of the reports. Based on the above, the embodiment of the application can automatically generate the corresponding evaluation report based on the evaluation result, can generate the evaluation report of a single data source, and can also generate the comparative analysis evaluation report among a plurality of data sources.
Specifically, when the evaluation report is generated, a report template can be preset to define the sections and format of the final report, such as a section on the data distribution of the sample training and test sets, a section on the compliance rate of each variable index, and a test conclusion section. An artificial intelligence large model is deployed offline and automatically generates the evaluation report according to the preset report template and the index calculation results. The report may include the specific value of each index, the result of comparing it with the set threshold, the identifiers of the data columns that exceed the threshold, and suggested measures. In addition, graphical displays such as bar charts, line charts, and heat maps may be provided to help readers understand the data quality intuitively.
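The template-filling part of this step can be illustrated without a large model. The sketch below only fills a preset template with index values and threshold comparisons; it does not reproduce the large-model generation described in the application. The template text, the function name, and the assumption that every index is "lower is better" with an upper-limit threshold are all hypothetical.

```python
from string import Template
from typing import Dict, List, Tuple

# Hypothetical report template with one section for index results.
REPORT_TEMPLATE = Template(
    "Evaluation report for data source: $source\n"
    "Index results:\n"
    "$index_lines\n"
    "Indexes exceeding threshold: $flagged\n"
)


def render_report(source: str, index_values: Dict[str, float],
                  thresholds: Dict[str, float]) -> Tuple[str, List[str]]:
    """Fill the template and return the text plus the names of flagged indexes."""
    lines: List[str] = []
    flagged: List[str] = []
    for name, value in index_values.items():
        limit = thresholds[name]
        exceeded = value > limit  # assumption: thresholds are upper limits
        if exceeded:
            flagged.append(name)
        status = "exceeds threshold" if exceeded else "within threshold"
        lines.append(f"  {name}: {value:.3f} (threshold {limit:.3f}, {status})")
    text = REPORT_TEMPLATE.substitute(
        source=source,
        index_lines="\n".join(lines),
        flagged=", ".join(flagged) or "none",
    )
    return text, flagged


report_text, flagged_indexes = render_report(
    "source_a",
    {"missing_rate": 0.35, "concentration": 0.40},
    {"missing_rate": 0.30, "concentration": 0.80},
)
```

In the scheme described here, structured output of this kind would be one input to the offline large model, which writes the conclusions and suggested measures.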
The embodiment of the application uses an artificial intelligence large model to generate the evaluation report of the data source. First, this improves the flexibility of the report conclusions: the large model can adjust its analysis logic according to a plurality of preset evaluation scenarios and indexes, and its internal logic and rules can be updated easily to adapt to new test standards and requirements.
Second, the large model is good at capturing key points when generating an evaluation report. It can analyze massive data efficiently through deep learning algorithms, quickly identify the key problems found during evaluation, and highlight and describe them in the report, which helps application personnel locate problems quickly and guides subsequent repair and optimization work.
Third, because the large model learns from a large amount of natural language text during training, it has mastered rich language expression skills and logical structures. When generating the conclusions of an evaluation report, the large model of the embodiment of the application can therefore describe the test results and problems clearly in appropriate language and sentence patterns. It can also adjust the difficulty and degree of specialization of the wording to the backgrounds and needs of different application personnel, so that the report conclusions are both accurate and easy to understand.
In addition, by generating the evaluation report of the data source through an offline deployed artificial intelligence large model, the embodiment of the application greatly reduces manual work and improves the efficiency, accuracy, and comprehensiveness of report generation, and it can adapt to complex and changing data sources and business scenarios through continuous learning and optimization.
In order to improve the presentation of the report, the embodiment of the application can classify the indexes, for example into the categories shown in Table 3 below, and display the index calculation results of each data source separately under each category, so that a user can see the index scores under each category clearly and intuitively.
TABLE 3
List class index               Variable class index    Comprehensive scoring index
Missing rate                   Missing rate            Missing rate
Concentration                  Concentration           Concentration
IV                             IV                      IV
Hit rate                       Correlation             Correlation
Accuracy rate                  Stability               Stability
Error rate                     KS
Effective difference rate
Ineffective difference rate
Stability
Further, when generating a comparative analysis evaluation report among a plurality of data sources, the data sources can first be classified, for example into multi-head lending data, payment data, and app list data. Data sources of the same class are then ranked by index and by composite score and compared side by side, which makes the report convenient for decision makers to use. On the one hand, ranking can be performed by index: all data sources of the same class are ranked on each key index, such as missing rate, concentration, correlation, IV value, KS value, and stability, to show which data source performs best on that index. On the other hand, ranking can be performed by composite score: each data source is scored on its overall performance across all key indexes, and an overall ranking is produced from the scores to compare the quality of the data sources. A decision maker can adjust the weights of the key indexes to change the ranking rule. This solves the problems that existing evaluation methods cannot quantitatively compare the advantages and disadvantages of similar data sources and that their results are not intuitive enough.
Of course, the specific manner of generating the evaluation report can be flexibly selected by those skilled in the art in combination with the prior art, and is not specifically limited herein.
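The two ranking modes can be sketched as follows. The sketch assumes that each key index has already been converted to a score where higher is better, and that all data sources in the dictionary belong to the same class; the source names, index scores, and weights are hypothetical.

```python
from typing import Dict, List

IndexScores = Dict[str, float]


def rank_by_index(sources: Dict[str, IndexScores], index: str) -> List[str]:
    """Rank same-class data sources on a single index score, best first."""
    return sorted(sources, key=lambda name: sources[name][index], reverse=True)


def rank_by_composite(sources: Dict[str, IndexScores],
                      weights: Dict[str, float]) -> List[str]:
    """Rank data sources on the weighted sum of their index scores, best first."""
    totals = {
        name: sum(weights[index] * scores[index] for index in weights)
        for name, scores in sources.items()
    }
    return sorted(totals, key=lambda name: totals[name], reverse=True)


# Hypothetical index scores for two data sources of the same class.
sources = {
    "multi_head_a": {"missing_rate": 90.0, "iv": 60.0, "ks": 70.0},
    "multi_head_b": {"missing_rate": 70.0, "iv": 80.0, "ks": 80.0},
}

iv_ranking = rank_by_index(sources, "iv")

# Changing the weights changes the ranking rule, and here the overall order.
coverage_first = rank_by_composite(
    sources, {"missing_rate": 0.6, "iv": 0.2, "ks": 0.2})
power_first = rank_by_composite(
    sources, {"missing_rate": 0.2, "iv": 0.4, "ks": 0.4})
```

With these example values, emphasizing the missing rate score puts multi_head_a first, while emphasizing IV and KS puts multi_head_b first.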
In some embodiments of the present application, after determining and outputting the final evaluation result of each external data source according to the base index score and the model gain score of each external data source, the evaluation method of the credit wind control data source further comprises receiving evaluation configuration adjustment information of a user, and updating and outputting the evaluation report according to the evaluation configuration adjustment information.
The embodiment of the application provides an adjusting function of evaluation configuration information, and can manually adjust each index threshold according to each index condition automatically calculated in the previous embodiment, thereby adjusting the content of report output, leading the generated evaluation report to be more suitable for interpretation and decision-making of service personnel, and improving the flexibility of generating the evaluation report. Therefore, the method solves the problems that the existing credit wind control data evaluation method only supports preset evaluation indexes and threshold values, lacks flexibility and cannot be well adapted to different scenes and specific requirements.
In the embodiment of the application, variable filtering flows can be set in the calculation stage, each flow is configured with different calculation indexes, and the passing number and the passing rate of each step are displayed.
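A variable filtering flow of this kind can be sketched as a sequence of named steps, each with its own keep condition. The step names, thresholds, and variable fields below are hypothetical, and the pass rate is assumed to be measured against the number of variables entering each step.

```python
from typing import Callable, Dict, List, Tuple

Variable = Dict[str, object]
Step = Tuple[str, Callable[[Variable], bool]]


def run_filter_flow(variables: List[Variable], steps: List[Step]
                    ) -> Tuple[List[Variable], List[Tuple[str, int, float]]]:
    """Apply each step in order and record (step name, pass count, pass rate)."""
    remaining = list(variables)
    report: List[Tuple[str, int, float]] = []
    for step_name, keep in steps:
        entering = len(remaining)
        remaining = [v for v in remaining if keep(v)]
        # Pass rate is relative to the variables entering this step (assumption).
        rate = len(remaining) / entering if entering else 0.0
        report.append((step_name, len(remaining), rate))
    return remaining, report


# Hypothetical variables with precomputed index values.
variables = [
    {"name": "v1", "missing_rate": 0.1, "iv": 0.05},
    {"name": "v2", "missing_rate": 0.5, "iv": 0.30},
    {"name": "v3", "missing_rate": 0.2, "iv": 0.01},
    {"name": "v4", "missing_rate": 0.0, "iv": 0.10},
]
steps = [
    ("missing_rate", lambda v: v["missing_rate"] <= 0.3),
    ("iv", lambda v: v["iv"] >= 0.02),
]
kept, flow_report = run_filter_flow(variables, steps)
kept_names = [v["name"] for v in kept]
```

In this example, three of four variables pass the missing rate step and two of those three pass the IV step.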
The embodiment of the application also provides an evaluation device 200 for a credit wind control data source. Fig. 2 is a schematic structural diagram of the evaluation device for a credit wind control data source in the embodiment of the application. As shown in fig. 2, the evaluation device 200 for a credit wind control data source includes an obtaining unit 210, a first scoring unit 220, a second scoring unit 230 and a determining unit 240, where:
An obtaining unit 210 for obtaining credit wind control data and evaluation configuration information of a user, the credit wind control data including credit wind control data of an internal data source and credit wind control data of a plurality of external data sources;
a first scoring unit 220, configured to process credit wind control data of a plurality of external data sources according to a basic index evaluation policy corresponding to the evaluation configuration information, so as to obtain a basic index score of each external data source;
A second scoring unit 230, configured to construct a risk prediction model based on the credit wind control data of the internal data source and the credit wind control data of the plurality of external data sources, and determine a model gain score of each external data source by using a model gain evaluation policy corresponding to the evaluation configuration information based on the risk prediction model;
And a determining unit 240, configured to determine and output a final evaluation result of each external data source according to the base index score and the model gain score of each external data source.
In some embodiments of the present application, the basic index evaluation policy includes a plurality of index dimensions, and the first scoring unit 220 is specifically configured to calculate index scores of each external data source in each index dimension according to credit wind control data of a plurality of external data sources, and sum the index scores of each external data source in each index dimension to obtain a basic index score of each external data source.
In some embodiments of the present application, the basic index evaluation policy further includes an index weight, an index value interval, and a corresponding interval score, and the first scoring unit 220 is specifically configured to determine an interval distribution ratio of the credit wind control data in each index dimension according to a value of the credit wind control data corresponding to each index dimension and the index value interval, and calculate an index score of each index dimension according to the index weight, the index value interval, the interval score, and the interval distribution ratio of the credit wind control data in each index dimension corresponding to each index dimension.
In some embodiments of the present application, the second scoring unit 230 is specifically configured to train the first risk prediction model by using the credit wind control data of the internal data source, fuse the credit wind control data of the internal data source with the credit wind control data of the plurality of external data sources respectively, to obtain a plurality of fused credit wind control data, and train the second risk prediction model by using each fused credit wind control data respectively.
In some embodiments of the present application, the model gain evaluation policy includes a model gain evaluation index, and the second scoring unit 230 is specifically configured to evaluate a risk prediction effect of a first risk prediction model and a risk prediction effect of a second risk prediction model corresponding to each external data source by using the model gain evaluation index, and calculate a model gain score of each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source.
In some embodiments of the present application, the model gain evaluation policy further includes an index value increasing section and a corresponding section score of the model gain evaluation index, and the second scoring unit 230 is specifically configured to calculate an index increasing value of the model gain evaluation index corresponding to each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source, and calculate the model gain score of each external data source according to the index increasing value, the index value increasing section and the corresponding section score of the model gain evaluation index corresponding to each external data source.
In some embodiments of the present application, the determining unit 240 is specifically configured to calculate a composite score of each external data source according to the base index score and the model gain score of each external data source, and generate and output an evaluation report including the evaluation report of each external data source and/or a comparative analysis report between external data sources using a preset report generation model based on the composite score of each external data source.
In some embodiments of the present application, the credit wind control data source evaluation device 200 further includes a receiving unit for receiving user's evaluation configuration adjustment information after determining and outputting a final evaluation result of each external data source according to the base index score and the model gain score of each external data source, and an updating unit for updating and outputting the evaluation report according to the evaluation configuration adjustment information.
It can be understood that the above-mentioned evaluation device for a credit wind control data source can implement each step of the evaluation method for a credit wind control data source provided in the foregoing embodiment, and the relevant explanation about the evaluation method for a credit wind control data source is applicable to the evaluation device for a credit wind control data source, which is not repeated herein.
Fig. 3 is a schematic structural view of an electronic device according to an embodiment of the present application. Referring to fig. 3, at the hardware level, the electronic device includes a processor, and optionally an internal bus, a network interface, and a memory. The memory may include an internal memory, such as a random-access memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, network interface, and memory may be interconnected by an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The buses may be classified as address buses, data buses, control buses, etc. For ease of illustration, only one bi-directional arrow is shown in fig. 3, but this does not mean that there is only one bus or only one type of bus.
The memory is used for storing programs. In particular, a program may include program code, and the program code includes computer operating instructions. The memory may include an internal memory and a non-volatile memory, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the evaluation device of the credit wind control data source at the logical level. The processor is used for executing the programs stored in the memory and is specifically used for executing the following operations:
acquiring credit wind control data and evaluation configuration information of a user, wherein the credit wind control data comprises credit wind control data of an internal data source and credit wind control data of a plurality of external data sources;
respectively processing credit wind control data of a plurality of external data sources according to a basic index evaluation strategy corresponding to the evaluation configuration information to obtain basic index scores of each external data source;
Constructing a risk prediction model based on the credit wind control data of the internal data source and the credit wind control data of a plurality of external data sources, and determining a model gain score of each external data source by using a model gain evaluation strategy corresponding to the evaluation configuration information based on the risk prediction model;
and determining and outputting a final evaluation result of each external data source according to the basic index score and the model gain score of each external data source.
The method performed by the evaluation device of the credit wind control data source disclosed in the embodiment shown in fig. 1 of the present application can be applied to or implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc., or may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor can implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
The embodiment of the present application also proposes a computer-readable storage medium storing one or more programs, the one or more programs including instructions that, when executed by an electronic device including a plurality of application programs, enable the electronic device to perform a method performed by an evaluation apparatus for credit-wind-control data sources in the embodiment shown in fig. 1, and specifically configured to perform:
acquiring credit wind control data and evaluation configuration information of a user, wherein the credit wind control data comprises credit wind control data of an internal data source and credit wind control data of a plurality of external data sources;
respectively processing credit wind control data of a plurality of external data sources according to a basic index evaluation strategy corresponding to the evaluation configuration information to obtain basic index scores of each external data source;
Constructing a risk prediction model based on the credit wind control data of the internal data source and the credit wind control data of a plurality of external data sources, and determining a model gain score of each external data source by using a model gain evaluation strategy corresponding to the evaluation configuration information based on the risk prediction model;
and determining and outputting a final evaluation result of each external data source according to the basic index score and the model gain score of each external data source.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (11)

1. A method for evaluating a credit wind control data source, the method comprising:
acquiring credit wind control data and evaluation configuration information of a user, wherein the credit wind control data comprises credit wind control data of an internal data source and credit wind control data of a plurality of external data sources;
respectively processing credit wind control data of a plurality of external data sources according to a basic index evaluation strategy corresponding to the evaluation configuration information to obtain basic index scores of each external data source;
Constructing a risk prediction model based on the credit wind control data of the internal data source and the credit wind control data of a plurality of external data sources, and determining a model gain score of each external data source by using a model gain evaluation strategy corresponding to the evaluation configuration information based on the risk prediction model;
and determining and outputting a final evaluation result of each external data source according to the basic index score and the model gain score of each external data source.
2. The method for evaluating a credit wind control data source according to claim 1, wherein the basic index evaluation policy includes a plurality of index dimensions, and the processing the credit wind control data of a plurality of external data sources according to the basic index evaluation policy corresponding to the evaluation configuration information to obtain a basic index score of each external data source includes:
According to credit wind control data of a plurality of external data sources, respectively calculating index scores of each external data source in each index dimension;
And summing the index scores of each external data source in each index dimension to obtain a basic index score of each external data source.
3. The method for evaluating a credit wind control data source according to claim 2, wherein the basic index evaluation strategy further includes an index weight, an index value interval, and a corresponding interval score, and wherein calculating the index score of each external data source in each index dimension according to the credit wind control data of the plurality of external data sources comprises:
determining an interval distribution proportion of the credit wind control data in each index dimension according to the values of the credit wind control data corresponding to that index dimension and the index value interval; and
calculating the index score of each index dimension according to the index weight, the index value interval, the interval score, and the interval distribution proportion of the credit wind control data in that index dimension.
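Illustrative sketch (not part of the claims). One possible reading of the interval-based scoring in claims 2 and 3 is shown below. The claims do not give a formula, so the combination rule used here (index weight multiplied by the proportion-weighted average of the interval scores) is an assumption, as are the names `IndexConfig`, `interval_proportions`, `index_score`, and `basic_index_score`, and the choice of left-closed intervals.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class IndexConfig:
    """Hypothetical configuration for one index dimension."""
    weight: float        # index weight
    edges: list[float]   # ascending boundaries between index value intervals
    scores: list[float]  # one interval score per interval: len(edges) + 1


def interval_proportions(values: list[float], edges: list[float]) -> list[float]:
    """Share of the values falling into each interval (left-closed by assumption)."""
    if not values:
        raise ValueError("no values for this index dimension")
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]


def index_score(values: list[float], cfg: IndexConfig) -> float:
    """Assumed rule: weight * sum(proportion_k * interval_score_k)."""
    props = interval_proportions(values, cfg.edges)
    return cfg.weight * sum(p * s for p, s in zip(props, cfg.scores))


def basic_index_score(data: dict[str, list[float]],
                      configs: dict[str, IndexConfig]) -> float:
    """Sum of the index scores over all configured index dimensions (claim 2)."""
    return sum(index_score(data[name], cfg) for name, cfg in configs.items())
```

Calling `basic_index_score` once per external data source, with the same `configs`, yields scores that can be compared across sources.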
4. The method for evaluating a credit wind control data source according to claim 1, wherein constructing the risk prediction model based on the credit wind control data of the internal data source and the credit wind control data of the plurality of external data sources comprises:
training a first risk prediction model using the credit wind control data of the internal data source;
fusing the credit wind control data of the internal data source with the credit wind control data of each of the plurality of external data sources, respectively, to obtain a plurality of sets of fused credit wind control data; and
training a second risk prediction model on each set of fused credit wind control data, respectively.
5. The method for evaluating a credit wind control data source according to claim 4, wherein the model gain evaluation strategy includes a model gain evaluation index, and wherein determining the model gain score of each external data source by using the model gain evaluation strategy corresponding to the evaluation configuration information based on the risk prediction model comprises:
evaluating, by using the model gain evaluation index, the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source, respectively; and
calculating the model gain score of each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to that external data source.
6. The method for evaluating a credit wind control data source according to claim 5, wherein the model gain evaluation strategy further includes an index lifting interval of the model gain evaluation index and a corresponding interval score, and wherein calculating the model gain score of each external data source according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source comprises:
calculating, according to the risk prediction effect of the first risk prediction model and the risk prediction effect of the second risk prediction model corresponding to each external data source, an index lifting value of the model gain evaluation index corresponding to that external data source; and
calculating the model gain score of each external data source according to the index lifting value, the index lifting interval, and the corresponding interval score of the model gain evaluation index corresponding to that external data source.
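Illustrative sketch (not part of the claims). The gain scoring of claims 4 to 6 can be pictured as comparing a baseline model (internal data only) with a model trained on fused data, then mapping the improvement onto an interval score. The claims do not name the model gain evaluation index; AUC is used here purely as an assumption, and `auc`, `model_gain_score`, `lift_edges`, and `interval_scores` are hypothetical names. Model training is left out: the functions take the predicted risk scores of already-trained models as input.

```python
from bisect import bisect_right


def auc(labels: list[int], scores: list[float]) -> float:
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def model_gain_score(labels: list[int],
                     base_scores: list[float],
                     fused_scores: list[float],
                     lift_edges: list[float],
                     interval_scores: list[float]) -> tuple[float, float]:
    """Return (index lifting value, model gain score) for one external source.

    base_scores:  predictions of the first model (internal data only).
    fused_scores: predictions of the second model (internal + this source).
    lift_edges:   ascending boundaries of the index lifting intervals.
    interval_scores: one score per interval, len(lift_edges) + 1 entries.
    """
    lift = auc(labels, fused_scores) - auc(labels, base_scores)
    return lift, interval_scores[bisect_right(lift_edges, lift)]
```

The pairwise AUC is quadratic in the sample size and is meant only to keep the sketch dependency-free; a library implementation would normally be preferred on real data.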
7. The method for evaluating a credit wind control data source according to claim 1, wherein determining and outputting the final evaluation result of each external data source according to the basic index score and the model gain score of each external data source comprises:
calculating a comprehensive score of each external data source according to the basic index score and the model gain score of that external data source; and
generating and outputting an evaluation report based on the comprehensive score of each external data source by using a preset report generation model, the evaluation report including an evaluation report of each external data source and/or a comparative analysis report between external data sources.
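Illustrative sketch (not part of the claims). Claim 7 does not state how the two scores are combined; a weighted sum is assumed below, with hypothetical weights `w_basic` and `w_gain`, followed by a ranking that could feed a comparative report between external data sources.

```python
def comprehensive_scores(basic: dict[str, float],
                         gain: dict[str, float],
                         w_basic: float = 0.5,
                         w_gain: float = 0.5) -> dict[str, float]:
    """Assumed rule: weighted sum of basic index score and model gain score."""
    return {src: w_basic * basic[src] + w_gain * gain[src] for src in basic}


def rank_sources(scores: dict[str, float]) -> list[tuple[str, float]]:
    """External data sources ordered from highest to lowest comprehensive score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Re-running these two functions with adjusted weights is one way the report update of claim 8 could be realized after the user changes the evaluation configuration.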
8. The method for evaluating a credit wind control data source according to claim 7, wherein, after determining and outputting the final evaluation result of each external data source according to the basic index score and the model gain score of each external data source, the method further comprises:
receiving evaluation configuration adjustment information of the user; and
updating the evaluation report according to the evaluation configuration adjustment information and outputting the updated evaluation report.
9. An evaluation device for a credit wind control data source, comprising:
an acquisition unit, configured to acquire credit wind control data and evaluation configuration information of a user, wherein the credit wind control data comprises credit wind control data of an internal data source and credit wind control data of a plurality of external data sources;
a first scoring unit, configured to process the credit wind control data of the plurality of external data sources, respectively, according to a basic index evaluation strategy corresponding to the evaluation configuration information to obtain a basic index score of each external data source;
a second scoring unit, configured to construct a risk prediction model based on the credit wind control data of the internal data source and the credit wind control data of the plurality of external data sources, and determine a model gain score of each external data source by using a model gain evaluation strategy corresponding to the evaluation configuration information based on the risk prediction model; and
a determining unit, configured to determine and output a final evaluation result of each external data source according to the basic index score and the model gain score of each external data source.
10. An electronic device, comprising:
a processor; and
a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the method for evaluating a credit wind control data source according to any one of claims 1 to 8.
11. A computer program product, comprising computer programs or instructions which, when executed by a processor, implement the method for evaluating a credit wind control data source according to any one of claims 1 to 8.
CN202411212362.7A 2024-08-30 2024-08-30 Evaluation method and device for credit wind control data source, electronic equipment and computer program product Pending CN119323472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411212362.7A CN119323472A (en) 2024-08-30 2024-08-30 Evaluation method and device for credit wind control data source, electronic equipment and computer program product

Publications (1)

Publication Number Publication Date
CN119323472A true CN119323472A (en) 2025-01-17

Family ID: 94228212

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination