WO2022062193A1

WO2022062193A1 - Individual credit assessment and explanation method and apparatus based on time sequence attribution analysis, and device and storage medium

Info

Publication number: WO2022062193A1
Application number: PCT/CN2020/135274
Authority: WO
Inventors: 程人; 李青山; 司华友
Original assignee: 南京博雅区块链研究院有限公司; 北京大学; 博雅正链(北京)科技有限公司
Priority date: 2020-09-28
Filing date: 2020-12-10
Publication date: 2022-03-31
Also published as: CN112215696A

Abstract

An individual credit assessment and explanation method and apparatus based on time sequence attribution analysis, and a device and a storage medium. The method comprises: constructing a credit scoring model; separately training the credit scoring model by using multiple groups of historical credit investigation data sets having time labels to obtain multiple historical credit scoring models; predicting, according to the types of the credit scoring models, multiple future credit scoring models on the basis of the multiple historical credit scoring models having time labels or the multiple groups of historical credit investigation data sets having the time labels; inputting credit investigation data to be assessed into a selected historical credit scoring model or future credit scoring model, so as to obtain a credit investigation assessment result of a credit investigation subject corresponding to said credit investigation data; and explaining the credit investigation assessment result. In the method, a series of credit scoring models are constructed for multiple historical time points and multiple future time points, credit assessment of a credit investigation subject at a specific time point can be realized by selecting an appropriate credit scoring model, and explanation with reference value is made for the assessment result, so that individual credit scores are guided to be improved.

Description

Personal credit evaluation and interpretation method, device, equipment and storage medium based on time series attribution analysis

Technical field

The invention relates to the field of financial credit reporting, in particular to a personal credit evaluation and interpretation method, device, equipment and storage medium based on time series attribution analysis.

Background art

The rapid development of information technologies such as big data and cloud computing has provided massive data and advanced technical means for the credit reporting business of the financial industry. Among these, personal credit reporting based on Internet big data has great development potential.

Using powerful machine learning classification models, existing personal credit scoring systems can already evaluate personal credit. However, these systems generally have a problem that has attracted much attention, that is, they cannot make valuable logical explanations for the evaluation results, so they cannot provide effective and feasible suggestions for credit reporting subjects to improve their credit scores.

Attribution analysis technology is an effective way to mine and identify the triggering factors of credit events in the field of credit reporting. At present, this technology has been applied to personal credit evaluation to try to make a valuable logical explanation for the evaluation results.

The main problem at present, however, is that the core scoring model of a scoring system is trained on historical credit data from a single period of time. The attributes of a credit reporting subject change over time, and new attributes that significantly affect the evaluation result may even appear. Such a scoring model may fail to produce objective and valid evaluation results, let alone valuable logical explanations for them.

SUMMARY OF THE INVENTION

In order to solve at least one of the above technical problems, a first aspect of the present invention provides a personal credit evaluation and interpretation method based on time series attribution analysis, and its specific technical scheme is as follows:

A personal credit evaluation and interpretation method based on time series attribution analysis, which includes:

Building a credit scoring model and initializing its model parameters, where the credit scoring model is a weighted scoring model or an unweighted scoring model;

Training the credit scoring model separately on several groups of time-labeled historical credit reporting data sets to obtain several trained, time-labeled historical credit scoring models, wherein: each historical credit reporting data set includes multiple pieces of historical credit data; the historical credit data in the same group has the same time label, and the historical credit data in different groups has different time labels; and the time label represents the generation time of the historical credit data to which it belongs;

According to the category of the credit scoring model, predicting several time-labeled future credit scoring models based on the several time-labeled historical credit scoring models or the several groups of time-labeled historical credit reporting data sets, wherein the time labels of the future credit scoring models are all different;

Inputting the credit data to be evaluated into a selected time-labeled historical or future credit scoring model to obtain the credit evaluation result, at the time point corresponding to the time label, of the credit reporting subject to which the credit data to be evaluated corresponds;

Explaining the credit evaluation result.

A second aspect of the present invention provides a personal credit evaluation and interpretation device based on time series attribution analysis, which includes:

A model initialization module for constructing a credit scoring model and initializing model parameters, where the credit scoring model is a weighted scoring model or an unweighted scoring model;

A historical credit scoring model acquisition module for training the credit scoring model separately on several groups of time-labeled historical credit reporting data sets to obtain several trained, time-labeled historical credit scoring models, wherein: each historical credit reporting data set includes multiple pieces of historical credit data; the historical credit data in the same group has the same time label, and the historical credit data in different groups has different time labels; and the time label represents the generation time of the historical credit data to which it belongs;

A future credit scoring model acquisition module for predicting, according to the category of the credit scoring model, several time-labeled future credit scoring models based on the several time-labeled historical credit scoring models or the several groups of time-labeled historical credit reporting data sets, wherein the time labels of the future credit scoring models are all different;

A credit evaluation module for inputting the credit data to be evaluated into a selected time-labeled historical or future credit scoring model to obtain the credit evaluation result, at the time point corresponding to the time label, of the credit reporting subject to which the credit data to be evaluated corresponds;

The interpretation module is used to interpret the credit evaluation result.

A third aspect of the present invention provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the personal credit evaluation and interpretation method based on time series attribution analysis described in the first aspect of the present invention.

A fourth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the personal credit evaluation and interpretation method based on time series attribution analysis described in the first aspect of the present invention.

Compared with the credit scoring model in the prior art, the present invention has the following significant advantages:

A series of credit scoring models that evolve over time is built for multiple historical time points and multiple future time points. Selecting an appropriate credit scoring model makes it possible to evaluate the credit status of a credit reporting subject at a specific point in time, which significantly improves the evaluation effect and ensures the interpretability of the evaluation results, thereby providing a valuable reference for how to improve personal credit scores.

Description of drawings

FIG. 1 is a schematic flowchart of a personal credit evaluation and interpretation method based on time series attribution analysis in an embodiment of the present invention;

FIG. 2 is a schematic flowchart of a personal credit evaluation and interpretation method based on time series attribution analysis in an embodiment of the present invention;

FIG. 3 is a schematic flowchart of a personal credit evaluation and interpretation method based on time series attribution analysis in an embodiment of the present invention;

FIG. 4 is a schematic flowchart of a personal credit evaluation and interpretation method based on time series attribution analysis in an embodiment of the present invention;

FIG. 5 is a schematic flowchart of a personal credit evaluation and interpretation method based on time series attribution analysis in an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an apparatus for evaluating and explaining personal credit based on time series attribution analysis in an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an electronic device in an embodiment of the present invention;

FIG. 8 is a data structure diagram of credit reporting data of a credit reporting subject in an embodiment of the present invention;

FIG. 9 is a schematic diagram of the logic for obtaining historical scoring models and future scoring models in an embodiment of the present invention;

FIG. 10 is a schematic diagram of a linear regression model of the attribute "debt ratio" in an embodiment of the present invention;

FIG. 11 is a flowchart of an evaluation result interpretation process in an embodiment of the present invention;

FIG. 12 is a logic diagram of a method for obtaining several approximate sample data by perturbing the attribute value of each attribute of the credit data in an embodiment of the present invention.

Detailed description

In order to make the above objects, features and advantages of the present invention more clearly understood, the present invention will be described in further detail below with reference to the accompanying drawings and specific embodiments.

Although the present invention provides method operation steps or device structures as shown in the following embodiments or drawings, the method or device may include more or fewer operation steps or module units based on routine practice or without creative effort. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the device is not limited to that shown in the embodiments of the present invention or the accompanying drawings. When the method or module structure is applied in an actual device or terminal product, it can be executed sequentially or in parallel according to the method or module structure shown in the embodiments or the accompanying drawings.

The core scoring model of the existing credit scoring system is obtained by training based on the historical credit data of a certain period of time. However, the attributes of the credit reporting subject will change over time, and even some new attributes that have a significant impact on the evaluation results will appear. Using the existing scoring models may not even be able to obtain objective and valid evaluation results, let alone provide valuable logical explanations for the evaluation results.

In view of the above-mentioned defects of the existing credit scoring models, the present invention provides a personal credit evaluation and interpretation method, device, equipment and storage medium based on time series attribution analysis, which build a series of credit scoring models that evolve over time for multiple historical time points and multiple future time points. Selecting an appropriate credit scoring model makes it possible to evaluate the credit status of a credit reporting subject at a specific point in the past, present or future, thereby significantly improving the evaluation effect and ensuring the interpretability of the evaluation results.

Before introducing the embodiments of the present invention, the following technical terms are explained:

Credit data: data related to the credit of the credit reporting subject. The credit data includes several attributes (also called features), and each attribute has an attribute value. The attribute values of a piece of credit data may be collected from one data source or from several data sources. FIG. 8 shows the data format of the credit data of a credit reporting subject in one embodiment: the credit data includes 12 attributes collected from three data sources, and each attribute corresponds to one attribute value. The three data sources are financial institutions (such as banks), consumer platforms (such as Taobao and Meituan), and social networks. The first two data sources directly reflect the credit behavior of credit reporting subjects, and the introduction of social networks can improve the scoring accuracy of the scoring model to a certain extent.

Of course, when performing model training or credit evaluation, it is necessary to perform necessary preprocessing on the credit data, such as converting it into a vector form.

Time label: the time label of a piece of credit data indicates when that credit data was generated. The span of the time label can be a year, a quarter, a month, and so on. In the embodiment shown in FIG. 8, the span of the time label is one year, and the time label of the credit data shown there is "2018", which means that all attribute data of that credit data was generated in 2018.

There are weighted scoring models and unweighted scoring models:

The scoring model can be expressed as Score=F(x), where: x is the attribute vector of the credit data of the credit reporting subject, F is the selected scoring model, and Score is the final credit score obtained by the scoring model.

If the scoring model F can be expressed in the following form:

$F(x) = \alpha_1 x_1 + \dots + \alpha_n x_n$

where $x_i$ is the attribute value of the i-th attribute of the credit reporting subject, $\alpha_i$ is the weight corresponding to the i-th attribute, and n is the number of attributes of the credit reporting subject,

then the scoring model F is defined as a weighted scoring model.

Otherwise, the scoring model F is defined as an unweighted scoring model.
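As a minimal illustration of the weighted form defined above, the following Python sketch evaluates F(x) as a plain weighted sum. The function name `weighted_score` and the sample weights and attribute values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def weighted_score(weights: Sequence[float], x: Sequence[float]) -> float:
    """Return F(x) = sum(alpha_i * x_i) for a weighted scoring model."""
    if len(weights) != len(x):
        raise ValueError("weights and attribute vector must have the same length")
    return sum(a * v for a, v in zip(weights, x))


# Hypothetical weights (alpha_i) and attribute values (x_i) for three attributes.
alphas = [0.5, -0.25, 1.0]
attributes = [2.0, 4.0, 3.0]
score = weighted_score(alphas, attributes)
```

Any model whose output cannot be written in this form, such as a tree ensemble or a neural network, falls into the unweighted category.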

The weighted scoring model can choose a logistic regression model. As the simplest classification algorithm, logistic regression has always been the mainstream classification algorithm in the industry. It has the advantages of simplicity and stability, strong interpretability, and easy detection and deployment.

For the unweighted scoring model, algorithm models such as a gradient boosting decision tree (GBDT) or a deep neural network can be chosen. GBDT is an ensemble algorithm whose base learners are classification and regression trees; its advantages are a strong classification effect and the ability to perform feature screening during training. A deep neural network can be understood as a neural network with multiple hidden layers. Through techniques such as activation functions and backpropagation, its large number of parameters can be adjusted in a very high-dimensional space, so that complex classification boundaries can be identified and a good classification effect achieved.

Method embodiment

As shown in FIG. 1 , the method for evaluating and explaining personal credit based on time series attribution analysis provided by an embodiment of the present invention includes the following steps:

S100. Build a credit scoring model and initialize model parameters, where the credit scoring model is a weighted scoring model or an unweighted scoring model.

S200. Train the credit scoring model separately on several groups of time-labeled historical credit reporting data sets to obtain several trained, time-labeled historical credit scoring models. Each historical credit data set includes multiple pieces of historical credit data; the historical credit data in the same group has the same time label, the historical credit data in different groups has different time labels, and the time label represents the generation time of the historical credit data to which it belongs.

The span (granularity) of the time label can be a year, a quarter, a month, or even a day. Generally speaking, the shorter the span of the time label (the finer the granularity), the more historical credit data collected with the same time label, and the more uniform the data distribution in the data set, the better the scoring effect of the trained historical credit scoring model.

In practical applications, the span of the time label can be selected according to specific needs. In the embodiment shown in FIG. 9, the span of the time label is one year and the current year is 2020. The credit data of the past three years is collected and sampled to obtain three historical credit data sets: the 2017 credit data set, the 2018 credit data set, and the 2019 credit data set. For example, all credit data in the 2017 credit data set was generated in 2017, and each piece of credit data represents the credit status of one credit reporting subject in 2017.

The three historical credit data sets are used as training samples to train the credit scoring model separately, yielding three corresponding historical credit scoring models: the 2017, 2018, and 2019 historical credit scoring models. Of course, in practical applications, credit data from more years can be collected to train more historical credit scoring models.
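The per-time-label training described above can be sketched as follows. This is an illustrative sketch only: the records, the two-element attribute vectors (a constant 1.0 acting as intercept plus one hypothetical attribute), and the tiny gradient-descent trainer `train_logistic` are assumptions, not the training procedure of the embodiment.

```python
import math
from collections import defaultdict


def train_logistic(samples, labels, lr=0.1, epochs=500):
    """Fit logistic-regression weights by batch gradient descent."""
    n = len(samples[0])
    w = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(n):
                grad[i] += (p - y) * x[i]
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad)]
    return w


# Hypothetical records: (time label, attribute vector, 1 = defaulted, 0 = repaid).
records = [
    (2017, [1.0, 0.2], 0), (2017, [1.0, 0.9], 1),
    (2018, [1.0, 0.3], 0), (2018, [1.0, 0.8], 1),
    (2019, [1.0, 0.1], 0), (2019, [1.0, 0.7], 1),
]

# Group the records by time label, then train one model per group.
groups = defaultdict(list)
for label, x, y in records:
    groups[label].append((x, y))

historical_models = {
    label: train_logistic([x for x, _ in rows], [y for _, y in rows])
    for label, rows in groups.items()
}
```

Keeping the trained models in a dictionary keyed by time label makes the later selection step a simple lookup.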

Specifically:

When the credit scoring model is a weighted scoring model, such as a logistic regression model, the specific implementation process of step S200 is as follows:

The logistic regression expression can be solved iteratively using classical gradient descent, and training is usually very fast. The result of logistic regression can easily be converted into a standard scorecard model; that is, the final total credit score can be split into the dimensional credit scores corresponding to each attribute:

(1) When solving the model, first use the method of variable binning to segment the variables;

(2) Then use WOE coding to encode the discrete variables after binning into continuous variables;

(3) After that, the solution training of the model is carried out.
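Steps (1) and (2) above can be sketched in Python as follows. The cut point, the sample data, and the helper names `to_bin` and `woe_by_bin` are hypothetical. The sign convention used here, WOE = ln(share of goods / share of bads), is one common choice and is an assumption, since the text does not specify it. The sketch also assumes every bin contains at least one good and one bad sample.

```python
import bisect
import math
from collections import Counter


def to_bin(value, cut_points):
    """Return the index of the bin that `value` falls into."""
    return bisect.bisect_right(cut_points, value)


def woe_by_bin(bin_ids, labels):
    """Return {bin: ln(share of goods in bin / share of bads in bin)}.

    Assumes label 0 = good, 1 = bad, and that no bin is empty of either class.
    """
    goods = Counter(b for b, y in zip(bin_ids, labels) if y == 0)
    bads = Counter(b for b, y in zip(bin_ids, labels) if y == 1)
    total_good, total_bad = sum(goods.values()), sum(bads.values())
    return {
        b: math.log((goods[b] / total_good) / (bads[b] / total_bad))
        for b in set(bin_ids)
    }


# Hypothetical "debt ratio" values and default labels.
debt_ratio = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
defaulted = [0, 0, 0, 1, 0, 1, 1, 1]
cut_points = [0.5]  # step (1): a single hypothetical cut point gives two bins

bins = [to_bin(v, cut_points) for v in debt_ratio]
woe = woe_by_bin(bins, defaulted)          # step (2): WOE value per bin
encoded = [woe[b] for b in bins]           # WOE-encoded attribute column
```

The encoded column can then be fed to the model training in step (3).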

The final result can be expressed as follows:

$F(x) = A - B(\alpha_1 x_1 + \dots + \alpha_n x_n), \quad \alpha_i = \theta_i w_i$

where A and B are constants. It can be seen that the credit score corresponding to each attribute is $-B\theta_i x_i$.

When the credit scoring model is an unweighted scoring model, such as a gradient boosting decision tree (GBDT) or a deep neural network, the specific implementation process of step S200 is as follows:

Using the historical credit data, multiple rounds of iteration are performed. Each round produces a weak classifier that is trained on the residual of the previous round's classifiers, and the weak classifiers obtained in all rounds are finally combined by weighted summation into an overall classifier.
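The residual-fitting loop described above can be illustrated with a deliberately small sketch. It uses single-feature decision stumps and squared-error residuals rather than full classification and regression trees, and the learning rate plays the role of the summation weight. The data and the names `fit_stump` and `boost` are hypothetical; the input is assumed to contain at least two distinct feature values.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump that minimizes squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds between distinct values
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, rounds=20, lr=0.5):
    """Each round fits a stump to the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    # The overall model is the weighted sum of all weak learners.
    return lambda x: base + lr * sum(s(x) for s in stumps)


# Hypothetical debt ratios and default labels (1 = defaulted).
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
```

Calling `model(0.2)` and `model(0.8)` returns values close to 0 and 1 respectively for this toy data.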

S300. According to the category of the credit scoring model, predict several time-labeled future credit scoring models based on the several time-labeled historical credit scoring models or the several groups of time-labeled historical credit reporting data sets, wherein the time labels of the future credit scoring models are all different.

Specifically:

When the credit scoring model is a weighted scoring model, as shown in FIG. 2, the specific implementation process of step S300 is as follows:

S301. Classify and summarize the attribute weights of the historical credit scoring models by attribute to obtain several time-labeled attribute weight sets.

Still taking the credit reporting data in the embodiment of FIG. 8 as an example: after step S200, m historical credit scoring models have been trained. As shown in FIG. 9, three historical models are trained; of course, in practical embodiments, more historical credit scoring models need to be trained.

The historical credit scoring model with time label j can be expressed as:

$F(x_j) = \alpha_{1j} x_{1j} + \dots + \alpha_{ij} x_{ij} + \dots + \alpha_{nj} x_{nj}$

where j is the time label, $x_{ij}$ is the attribute value of the i-th attribute at the time point corresponding to time label j, $\alpha_{ij}$ is the weight of the i-th attribute at the time point corresponding to time label j, and n is the number of attributes.

Then the attribute weight set corresponding to the i-th attribute is $(\alpha_{i1}, \dots, \alpha_{im})$.

S302. Using each attribute weight set as a training data set, train several linear regression models in one-to-one correspondence with the attribute weight sets.

That is, for the i-th attribute, regression analysis is performed on the attribute weight set $(\alpha_{i1}, \dots, \alpha_{im})$ to fit a linear regression model for that attribute. A total of n linear regression models are fitted.

Figure 10 shows the linear regression model trained by taking the attribute "debt ratio" as an example.

S303. Using the trained linear regression models, predict the time-labeled attribute weight of each attribute at several future time points.

Once the attribute weight regression model for each attribute has been trained, the weight of each attribute at a given future time point can be predicted, yielding the time-labeled attribute weights of each attribute at that time point.

S304. Construct several time-labeled future credit scoring models from the predicted time-labeled attribute weights at the several future time points.
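Steps S301 to S304 can be sketched as follows, assuming a simple least-squares line per attribute with the year as the regressor. The weight values, attribute names, and the helper `fit_line` are hypothetical.

```python
def fit_line(times, values):
    """Ordinary least squares for value = slope * time + intercept."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean


# Hypothetical attribute weights of three historical models, keyed by time label.
historical_weights = {
    2017: {"income": 0.30, "debt_ratio": -0.50},
    2018: {"income": 0.35, "debt_ratio": -0.55},
    2019: {"income": 0.40, "debt_ratio": -0.60},
}
years = sorted(historical_weights)
attributes = list(historical_weights[years[0]])

# S301: regroup the weights by attribute into time-ordered weight sets.
weight_sets = {a: [historical_weights[y][a] for y in years] for a in attributes}

# S302: fit one linear regression model per attribute.
trends = {a: fit_line(years, ws) for a, ws in weight_sets.items()}

# S303 and S304: predict future weights and assemble one model per future year.
future_models = {
    year: {a: slope * year + intercept for a, (slope, intercept) in trends.items()}
    for year in (2020, 2021, 2022)
}
```

With these sample values, each weight changes by 0.05 per year, so the extrapolated 2020 model has weights of approximately 0.45 for income and -0.65 for debt ratio.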

When the credit scoring model is an unweighted scoring model, a scoring model that can effectively evaluate the credit of the credit reporting subject at future time points is obtained by making full use of the historical credit data: the trend of the data over time is learned from the historical credit data, and a series of credit data sets corresponding to future time points is predicted. Specifically:

As shown in FIG. 4, the specific implementation process of step S300 is as follows:

S301'. Obtain the probability distribution of each historical credit data set and the parameter values of that distribution, where the probability distribution is a Gaussian distribution.

S302'. Using a kernel function operation, map the probability distribution of each historical credit data set into a reproducing kernel Hilbert space (RKHS) to obtain several time-labeled historical vectors in one-to-one correspondence with the historical credit data sets.

S303'. Using the historical vectors as the training data set, train a vector regression model.

S304'. Use the trained vector regression model to predict several time-labeled prediction vectors.

S305'. Using a kernel function operation, inversely transform the prediction vectors back into the probability distribution space to obtain several groups of time-labeled predicted credit data sets.

S306'. Train the credit scoring model separately on the several groups of time-labeled predicted credit data sets to obtain several trained, time-labeled future credit scoring models.
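The following sketch illustrates only the overall shape of this pipeline and is a deliberately simplified stand-in: instead of embedding the distributions in an RKHS and inverting the embedding, it extrapolates the Gaussian parameters (mean and standard deviation) of a single attribute directly with a least-squares line and then samples predicted data sets from the extrapolated Gaussians. The data, the helper names, and this substitution are all assumptions made for illustration.

```python
import random
import statistics


def fit_line(times, values):
    """Ordinary least squares for value = slope * time + intercept."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean


# Hypothetical single-attribute data sets (for example, debt ratio) per time label.
datasets = {
    2017: [0.30, 0.34, 0.38, 0.42, 0.46],
    2018: [0.35, 0.39, 0.43, 0.47, 0.51],
    2019: [0.40, 0.44, 0.48, 0.52, 0.56],
}
years = sorted(datasets)

# Analogue of S301': estimate Gaussian parameters for each historical data set.
means = [statistics.mean(datasets[y]) for y in years]
stds = [statistics.stdev(datasets[y]) for y in years]

# Simplified analogue of S302' to S304': regress the parameters on time.
mean_trend = fit_line(years, means)
std_trend = fit_line(years, stds)


def predict_dataset(year, size, rng):
    """Analogue of S305': sample a predicted data set for a future year."""
    mu = mean_trend[0] * year + mean_trend[1]
    sigma = max(std_trend[0] * year + std_trend[1], 1e-9)
    return [rng.gauss(mu, sigma) for _ in range(size)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
predicted = {y: predict_dataset(y, 200, rng) for y in (2020, 2021, 2022)}
```

Each predicted data set would then be used to train a future scoring model in the same way as the historical data sets (S306').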

Continuing to refer to Figure 9, after step S300 is performed, three future credit scoring models with time tags are obtained, which are the future credit scoring model in 2020, the future credit scoring model in 2021, and the future credit scoring model in 2022.

S400. Input the credit data to be evaluated into the selected time-labeled historical or future credit scoring model to obtain the credit evaluation result, at the time point corresponding to the time label, of the credit reporting subject to which the credit data to be evaluated corresponds.

As shown in FIG. 9, the current time is 2020. To evaluate the credit status of a credit reporting subject in 2018, the subject's credit data is input into the 2018 historical credit scoring model to obtain the subject's credit evaluation result for 2018. To evaluate the credit status of the subject in 2022, the subject's credit data is input into the 2022 future credit scoring model to obtain the subject's credit evaluation result for 2022.
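Selecting a model by time label amounts to a keyed lookup. The sketch below assumes weighted models stored as weight lists in a dictionary keyed by year; the weights, the attribute vector, and the function name `evaluate_at` are hypothetical.

```python
def evaluate_at(models, time_label, x):
    """Score attribute vector x with the model registered for time_label."""
    try:
        weights = models[time_label]
    except KeyError:
        raise KeyError(f"no scoring model for time label {time_label!r}") from None
    return sum(a * v for a, v in zip(weights, x))


# Hypothetical registry mixing a historical model and a predicted future model.
models = {
    2018: [0.5, -1.0],   # historical credit scoring model for 2018
    2022: [0.25, -2.0],  # future credit scoring model for 2022
}
subject = [4.0, 0.5]     # hypothetical attribute vector of one subject

score_2018 = evaluate_at(models, 2018, subject)
score_2022 = evaluate_at(models, 2022, subject)
```

The same attribute vector yields different scores depending on which time-labeled model is selected, which is the point of keeping one model per time point.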

S500. Explain the credit evaluation result.

As shown in Figure 3, the specific implementation process of S500 is as follows:

S501. Obtain the weight of each attribute from the credit evaluation result and calculate the total weight.

S502. Calculate the weight ratio of the weight of each attribute to the total weight.

S503. Rank the importance of each attribute according to the weight ratio.

S504, evenly divide the sorted attributes into several intervals.

For a weighted scoring model, the higher the weight, the greater the influence of the corresponding attribute on the evaluation result, that is to say, the greater the influence of the attribute on the credit of the credit subject. Therefore, the credit reporting subject can focus on the attributes in the top-ranked interval, and improve the credit score of the credit reporting subject by improving the attribute values of these attributes.
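Steps S501 to S504 can be sketched as follows. The sketch assumes that the magnitude (absolute value) of a weight is what counts toward the total, which the text does not state explicitly, and that "evenly divide" means consecutive chunks of equal size. The weights and the name `rank_into_intervals` are hypothetical.

```python
def rank_into_intervals(weights, n_intervals):
    """Return (weight ratios, ranked attributes split into intervals)."""
    # S501: total weight (assumption: use absolute values).
    total = sum(abs(w) for w in weights.values())
    # S502: ratio of each attribute's weight to the total.
    ratios = {a: abs(w) / total for a, w in weights.items()}
    # S503: rank attributes by weight ratio, largest first.
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    # S504: split the ranking into consecutive intervals of equal size.
    size = -(-len(ranked) // n_intervals)  # ceiling division
    intervals = [ranked[i:i + size] for i in range(0, len(ranked), size)]
    return ratios, intervals


# Hypothetical attribute weights taken from a weighted scoring model.
weights = {
    "income": 0.4,
    "debt_ratio": -0.3,
    "overdue_count": -0.2,
    "account_age": 0.1,
}
ratios, intervals = rank_into_intervals(weights, n_intervals=2)
```

Here the first interval contains `income` and `debt_ratio`, the attributes a subject would focus on first.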

However, when some attributes have extreme weights, the weight-based ranking alone may not objectively and truly reflect the importance of the attributes.

In view of this, optionally, the historical score of each attribute may also be considered.

As shown in Figure 3, optionally, the S500 further includes:

S505. Calculate the score distribution of each attribute from the historical credit information data.

Specifically, for a given attribute, the proportion of credit reporting subjects falling in each score segment can be counted. The proportions referred to here are accumulated across score segments in order of score from low to high, so the value for a segment actually represents the number of subjects whose score is greater than or equal to that segment.

S506. Calculate the score ratio of each attribute.

Specifically, for a given attribute, the score segment containing the credit reporting subject's score is found, and the subject's score ratio for that attribute is then read from the cumulative proportion for that segment.

S507. Reorder the attributes in each of the intervals based on the score ratio of each attribute.

Attributes in the same interval are sorted again according to the score ratio. The lower the score ratio, the higher the ranking. The final ranking result is presented to the user, and the weight ratio and score ratio will also be displayed together.

The credit reporting subject can weigh and select the importance of attributes to the evaluation results in combination with the weight ratio and score ratio to improve their credit.

When the credit scoring model is an unweighted scoring model, interpreting the evaluation result requires training a local weighted scoring model from the credit scoring model together with the attribute data of the credit reporting subject. Specifically, the Local Interpretable Model-agnostic Explanations (LIME) algorithm can be used to obtain the local weighted scoring model; in theory, LIME can interpret the evaluation results of any unweighted scoring model.

As shown in FIG. 5 and FIG. 11, interpreting the credit evaluation result using LIME includes:

S501'. Perturb the attributes of the credit data to be evaluated to obtain an approximate sample set composed of several samples similar to the credit data to be evaluated.

Still taking the credit reporting data in the embodiment of FIG. 8 as an example: as shown in FIG. 12, by perturbing the attribute values of the attributes of the credit data (income and debt ratio in the figure), several samples similar to the credit data to be evaluated are obtained, which together form the approximate sample set.

S502'. Input the approximate sample set into the historical or future credit scoring model that produced the evaluation result to obtain a sample evaluation result set.

S503'. Train a local weighted scoring model based on the approximate sample set and the sample evaluation result set.

S504'. Interpret the evaluation result based on the attribute weights of the local weighted scoring model.
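The four steps above can be sketched in plain Python as a LIME-style local surrogate. This is not the `lime` library and not necessarily the exact procedure of the embodiment: the Gaussian perturbation, the exponential proximity kernel, the weighted least-squares fit, the `black_box` stand-in, and all parameter values are assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def local_surrogate(model, x0, n_samples=500, scale=0.1, width=0.25, seed=0):
    """Return local attribute weights of `model` around the point x0."""
    rng = random.Random(seed)
    # S501': perturb the attributes to build the approximate sample set.
    samples = [[v + rng.gauss(0.0, scale) for v in x0] for _ in range(n_samples)]
    # S502': query the scoring model to get the sample evaluation result set.
    outputs = [model(s) for s in samples]
    # Assumption: weight samples by proximity to x0 with an exponential kernel.
    prox = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(s, x0)) / width ** 2)
        for s in samples
    ]
    # S503': fit a weighted linear model (with intercept) by normal equations.
    rows = [[1.0] + s for s in samples]
    d = len(rows[0])
    xtwx = [
        [sum(w * r[i] * r[j] for w, r in zip(prox, rows)) for j in range(d)]
        for i in range(d)
    ]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(prox, rows, outputs)) for i in range(d)]
    coef = solve(xtwx, xtwy)
    # S504': the non-intercept coefficients are the local attribute weights.
    return coef[1:]


def black_box(x):
    """Hypothetical stand-in for a trained GBDT or neural network scorer."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 2.0 * x[1])))


# Hypothetical subject: normalized income 0.5 and debt ratio 0.4.
local_weights = local_surrogate(black_box, [0.5, 0.4])
```

For this stand-in model the local weight of income comes out positive and that of debt ratio negative, which is the kind of attribute-level statement the interpretation step reports.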

It can be seen that the personal credit evaluation method of the present invention constructs a series of credit scoring models over time for multiple historical time points and multiple future time points. Choosing an appropriate credit scoring model can realize the evaluation of the credit status of the credit reporting subject at a certain point in time, thereby significantly improving the evaluation effect of the evaluation model and ensuring the interpretability of the evaluation results.

Device embodiment

As shown in FIG. 6, the personal credit evaluation device based on time series attribution analysis in this embodiment includes a model initialization module 10, a historical credit scoring model acquisition module 20, a future credit scoring model acquisition module 30, a credit evaluation module 40, and an interpretation module 50, wherein:

The model initialization module 10 is used for constructing a credit scoring model and initializing model parameters, and the credit scoring model is a weighted scoring model or an unweighted scoring model.

The historical credit scoring model acquisition module 20 is used to separately train the credit scoring model by using several groups of historical credit reporting data sets with time tags to obtain several trained historical credit scoring models with time tags, wherein : Each of the historical credit data sets includes multiple pieces of historical credit data, the historical credit data in the same group has the same time label, the historical credit data in different groups has different time labels, the time label Indicates the data generation time of the historical credit data to which it belongs.

The future credit scoring model acquisition module 30 is configured to predict, according to the category of the credit scoring model, several future credit scoring models with time labels based on the several historical credit scoring models with time labels or the several groups of historical credit reporting data sets with time labels, wherein the time labels of the future credit scoring models differ from one another;

The credit evaluation module 40 is used for inputting the credit data to be evaluated into a selected time-labeled historical credit scoring model or future credit scoring model, so as to obtain the credit evaluation result of the credit reporting subject corresponding to the credit data to be evaluated at the time point corresponding to the time label;

The interpretation module 50 is used for interpreting the credit evaluation result.

Since the processing performed by each functional module of the personal credit evaluation device in this embodiment is consistent with the corresponding processing of the personal credit evaluation method in the first embodiment, the processing of each functional module is not described again here; reference may be made to the related description of the first embodiment.

Electronic device embodiment

FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 7, the electronic device 60 includes a processor 61 and a memory 63, and the processor 61 and the memory 63 are connected, for example, through a bus 62.

The processor 61 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logical blocks, modules and circuits described in connection with this disclosure. The processor 61 may also be a combination that realizes computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.

The bus 62 may include a path to transfer information between the components described above. The bus 62 may be a PCI bus, an EISA bus, or the like. The bus 62 can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is shown in the figure, but it does not mean that there is only one bus or one type of bus.

The memory 63 may be a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disc storage, a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, without limitation.

The memory 63 is used to store the application program code of the solution of the present application, and execution of that code is controlled by the processor 61. The processor 61 is configured to execute the application program code stored in the memory 63 to implement the personal credit evaluation method of the first embodiment.

The embodiment of the present application finally provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the program is executed by the processor, the personal credit evaluation method in the first embodiment is implemented.

The invention has been described above in sufficient detail with certain particularities. Those of ordinary skill in the art should understand that the descriptions in the embodiments are only exemplary, and all changes made without departing from the true spirit and scope of the present invention should belong to the protection scope of the present invention. The claimed scope of the present invention is defined by the claims, rather than by the above description in the embodiments.

Claims

A personal credit evaluation and interpretation method based on time series attribution analysis, characterized in that it includes:

Building a credit scoring model and initializing model parameters, the credit scoring model is a weighted scoring model or an unweighted scoring model;

The credit scoring model is separately trained by using several groups of time-labeled historical credit reporting data sets to obtain several trained time-labeled historical credit scoring models, wherein: each of the historical credit reporting data sets includes multiple pieces of historical credit data, the historical credit data in the same group have the same time label, the historical credit data in different groups have different time labels, and the time label represents the data generation time of the historical credit data to which it belongs;

According to the category of the credit scoring model, predict and obtain several future credit scoring models with time labels based on the several historical credit scoring models with time labels or the several groups of historical credit reporting data sets with time labels, wherein the time labels of the future credit scoring models differ from one another;

Input the credit data to be evaluated into a selected time-labeled historical credit scoring model or future credit scoring model, to obtain the credit evaluation result of the credit reporting subject corresponding to the credit data to be evaluated at the time point corresponding to the time label;

Explain the results of the credit evaluation.
The personal credit evaluation and interpretation method as claimed in claim 1, characterized in that:

When the credit scoring model is a weighted scoring model, a number of future credit scoring models with time stamps are predicted and obtained based on the several time stamped historical credit scoring models, including:

Classify and summarize the attribute weights of each of the historical credit scoring models according to attributes, so as to obtain several attribute weight sets with time labels;

Using each of the attribute weight sets as a training data set, train several linear regression models in one-to-one correspondence with the several attribute weight sets;

Using several of the trained linear regression models, respectively predict the attribute weights with time labels for each attribute at several time points in the future;

The several time-tagged future credit scoring models are constructed based on the predicted time-tagged attribute weights of each attribute at several future time points.
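
The weight-extrapolation steps above can be sketched as follows, under stated assumptions: the time labels are years, each attribute weight is regressed on time with an ordinary least-squares line via `numpy.polyfit`, and the attribute names and weight values in `history` are invented for illustration. The function name `forecast_weights` is hypothetical.

```python
import numpy as np

# Hypothetical attribute weights of four historical models, keyed by time label.
history = {
    2017: {"income": 0.40, "overdue_count": -0.90, "account_age": 0.20},
    2018: {"income": 0.45, "overdue_count": -0.95, "account_age": 0.22},
    2019: {"income": 0.50, "overdue_count": -1.00, "account_age": 0.24},
    2020: {"income": 0.55, "overdue_count": -1.05, "account_age": 0.26},
}

def forecast_weights(history, future_times):
    """Fit one linear regression per attribute and predict its future weights."""
    labels = sorted(history)
    t = np.array(labels, dtype=float)
    t0 = t.mean()  # center the time axis for numerical stability
    forecast = {ft: {} for ft in future_times}
    for attr in history[labels[0]]:
        # Attribute weight set: the weights of one attribute ordered by time label.
        series = np.array([history[label][attr] for label in labels])
        slope, intercept = np.polyfit(t - t0, series, deg=1)
        for ft in future_times:
            forecast[ft][attr] = float(slope * (ft - t0) + intercept)
    return forecast

# Each entry is the weight vector of one time-labeled future scoring model.
future_models = forecast_weights(history, [2021, 2022])
```

Each dictionary in `future_models` can then be loaded as the coefficients of a weighted scoring model for its time label.
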
The personal credit evaluation and interpretation method according to claim 2, wherein the interpreting the evaluation result comprises:

Obtain the weight of each attribute from the credit evaluation result and calculate the total weight;

Calculate the weight ratio of the weight of each attribute to the total weight;

Rank the importance of each attribute according to the weight ratio;

The sorted attributes are evenly divided into several intervals.
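
The ranking and partitioning steps above can be sketched as follows. The sketch assumes that absolute weight values are used so that negative weights still count toward importance, and that when the attributes cannot be split exactly evenly the earlier intervals receive the extra ones. The function name `rank_into_intervals` and the example weights are hypothetical.

```python
def rank_into_intervals(weights, n_intervals=3):
    """Rank attributes by weight ratio and split the ranking into intervals."""
    # Total weight, taken over absolute values (an assumption).
    total = sum(abs(w) for w in weights.values())
    # Weight ratio of each attribute to the total weight.
    ratios = {name: abs(w) / total for name, w in weights.items()}
    # Importance ranking, most important attribute first.
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    # Split the ranking into n_intervals contiguous groups of near-equal size.
    size, extra = divmod(len(ranked), n_intervals)
    intervals, start = [], 0
    for i in range(n_intervals):
        end = start + size + (1 if i < extra else 0)
        intervals.append(ranked[start:end])
        start = end
    return ratios, intervals

example_weights = {
    "income": 0.5, "overdue_count": -1.0, "account_age": 0.25,
    "num_cards": 0.15, "inquiries": -0.06, "education": 0.04,
}
ratios, intervals = rank_into_intervals(example_weights, n_intervals=3)
```
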
The personal credit evaluation and interpretation method according to claim 3, wherein the interpreting the evaluation result further comprises:

Calculate the score distribution of each attribute from historical credit data;

Calculate the proportion of scores for each attribute;

Attributes in each of the intervals are reordered based on the score ratio of each attribute.
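
A possible reading of the reordering step is sketched below. The claim does not define the score ratio precisely, so the sketch assumes it is the fraction of historical values of an attribute that lie at or below the subject's value, and that attributes with a lower ratio are placed first within each interval because they leave more room for improvement. Both choices, the function name `reorder_intervals`, and the data are hypothetical.

```python
import numpy as np

def reorder_intervals(intervals, subject, history):
    """Reorder the attributes inside each interval by an assumed score ratio."""
    def score_ratio(attr):
        # Fraction of historical values at or below the subject's value.
        return float(np.mean(np.asarray(history[attr]) <= subject[attr]))
    # The interval boundaries are kept; only the order inside each one changes.
    return [sorted(group, key=score_ratio) for group in intervals]

# Hypothetical per-attribute scores taken from historical credit data.
history_scores = {
    "overdue_count": [10, 20, 30, 40],
    "income": [10, 20, 30, 40],
    "account_age": [10, 20, 30, 40],
    "num_cards": [10, 20, 30, 40],
}
subject_scores = {"overdue_count": 35, "income": 15, "account_age": 25, "num_cards": 45}
ranked_intervals = [["overdue_count", "income"], ["account_age", "num_cards"]]
reordered = reorder_intervals(ranked_intervals, subject_scores, history_scores)
```
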
The personal credit evaluation and interpretation method as claimed in claim 1, characterized in that:

When the credit scoring model is an unweighted scoring model, several time-labeled future credit scoring models are predicted and obtained based on the several groups of time-labeled historical credit reporting data sets, including:

Obtain the probability distribution of each of the historical credit information data sets and the parameter values of the probability distribution, where the probability distribution is a Gaussian distribution;

Transform the probability distribution of each of the historical credit information data sets into a reproducing kernel Hilbert space using kernel function operations, and obtain several historical vectors with time labels in one-to-one correspondence with the historical credit information data sets;

Using the several historical vectors as a training data set, train a vector regression model;

Use the trained vector regression model to predict and obtain several prediction vectors with time labels;

Use kernel function operations to inversely transform the several prediction vectors into the probability distribution space, so as to obtain several groups of time-labeled predicted credit information data sets;

The credit scoring model is separately trained by using the several groups of time-labeled predicted credit information data sets to obtain several trained future credit scoring models with time labels.
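
The distribution-forecasting steps above can be illustrated for a single attribute with a rough sketch. It swaps in several simplifications that the claim does not specify: the embedding is approximated with random Fourier features of an RBF kernel, for which the mean embedding of a Gaussian has a closed form; the vector regression is an independent least-squares line per coordinate; and the inverse transform is a grid search for the Gaussian whose embedding is closest to the predicted vector. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES = 200
freqs = rng.normal(0.0, 1.0, N_FEATURES)          # RBF kernel, unit bandwidth (assumed)
phases = rng.uniform(0.0, 2.0 * np.pi, N_FEATURES)

def embed_gaussian(mu, sigma):
    """Approximate kernel mean embedding of N(mu, sigma^2), in closed form."""
    damping = np.exp(-0.5 * (freqs * sigma) ** 2)
    return np.sqrt(2.0 / N_FEATURES) * damping * np.cos(freqs * mu + phases)

def fit_and_predict(times, vectors, future_time):
    """Regress every embedding coordinate on time and extrapolate."""
    t = np.asarray(times, dtype=float)
    design = np.column_stack([t - t.mean(), np.ones_like(t)])
    coef, *_ = np.linalg.lstsq(design, np.asarray(vectors), rcond=None)
    return np.array([future_time - t.mean(), 1.0]) @ coef

def preimage(vector, mu_grid, sigma_grid):
    """Return the Gaussian parameters whose embedding is closest to `vector`."""
    best = min(
        ((m, s) for m in mu_grid for s in sigma_grid),
        key=lambda p: np.linalg.norm(embed_gaussian(*p) - vector),
    )
    return float(best[0]), float(best[1])

# Hypothetical Gaussian parameters (mean, std) of one attribute at four time labels.
times = [0, 1, 2, 3]
params = [(0.0, 1.0), (0.1, 1.0), (0.2, 1.0), (0.3, 1.0)]
history_vectors = [embed_gaussian(m, s) for m, s in params]
predicted = fit_and_predict(times, history_vectors, future_time=4)
mu_hat, sigma_hat = preimage(
    predicted, np.linspace(-1.0, 1.0, 41), np.linspace(0.5, 1.5, 21)
)
# Samples drawn from the recovered distribution stand in for the predicted data set.
predicted_data = rng.normal(mu_hat, sigma_hat, size=1000)
```

In a full pipeline, `predicted_data` (extended to all attributes and labels) would be used to train the future scoring model for that time label.
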
The personal credit evaluation and interpretation method according to claim 5, wherein the evaluation result is explained by using a local interpretable model-agnostic explanation method, comprising:

Perturb the attributes of the credit data to be evaluated to obtain an approximate sample set consisting of several sample data similar to the credit data to be evaluated;

Inputting the approximate sample set into a historical credit scoring model or a future credit scoring model that generates the evaluation result, to obtain a sample evaluation result set;

A local weighted scoring model is obtained by training based on the approximate sample set and the sample evaluation result set;

The evaluation results are interpreted based on the attribute weights of the local weighted scoring model.
The personal credit evaluation and interpretation method as claimed in claim 1, characterized in that:

The weighted scoring model includes a logistic regression model;

The unweighted scoring model includes a gradient boosting decision tree model and a neural network model.
A personal credit evaluation and interpretation device based on time series attribution analysis, characterized in that it includes:

A model initialization module for constructing a credit scoring model and initializing model parameters, where the credit scoring model is a weighted scoring model or an unweighted scoring model;

The historical credit scoring model acquisition module is used to separately train the credit scoring model by using several groups of historical credit reporting data sets with time labels to obtain several trained historical credit scoring models with time labels, wherein: each of the historical credit reporting data sets includes multiple pieces of historical credit reporting data, the historical credit reporting data in the same group have the same time label, the historical credit reporting data in different groups have different time labels, and the time label represents the data generation time of the historical credit reporting data to which it belongs;

The future credit scoring model acquisition module is used to predict, according to the category of the credit scoring model, several future credit scoring models with time labels based on the several historical credit scoring models with time labels or the several groups of historical credit reporting data sets with time labels, wherein the time labels of the future credit scoring models differ from one another;

The credit evaluation module is used to input the credit data to be evaluated into a selected time-labeled historical credit scoring model or future credit scoring model, so as to obtain the credit evaluation result of the credit reporting subject corresponding to the credit data to be evaluated at the time point corresponding to the time label;

The interpretation module is used to interpret the credit evaluation result.
An electronic device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, the personal credit evaluation and interpretation method based on time series attribution analysis according to any one of claims 1 to 7 is implemented.
A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the program is executed by a processor, the personal credit evaluation and interpretation method based on time series attribution analysis according to any one of claims 1 to 7 is implemented.