CN114742643A - Model interpretable method for detecting interaction characteristics in field of financial wind control - Google Patents

Model interpretable method for detecting interaction characteristics in field of financial wind control

Info

Publication number
CN114742643A
CN114742643A (application CN202210487256.4A)
Authority
CN
China
Prior art keywords
model
interactive
gam
training
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210487256.4A
Other languages
Chinese (zh)
Inventor
苗雨提
王冠
杨根科
褚健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Institute Of Artificial Intelligence Shanghai Jiaotong University
Original Assignee
Ningbo Institute Of Artificial Intelligence Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Institute Of Artificial Intelligence Shanghai Jiaotong University filed Critical Ningbo Institute Of Artificial Intelligence Shanghai Jiaotong University
Priority to CN202210487256.4A
Publication of CN114742643A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03Credit; Loans; Processing thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The invention discloses a model interpretable method for detecting interactive features in the field of financial wind control, relating to graph neural network technology and model interpretability technology, and comprising the following steps: step 1, training an additive model on the existing training features and training labels; step 2, detecting existing interactive feature pairs with an interactive feature detection module, constructing a GAM model from the obtained interactive feature pairs, the training labels and the residuals of step 1, and adding the additive model and this GAM model to obtain a GAM2 model containing the interactive feature pairs; step 3, following the principle that a high-order interaction exists if and only if all of its low-order interactions exist, iterating the GAM2 model over multiple rounds until the interactive feature order no longer increases, obtaining a GAMn model; and step 4, realizing visualization and report export.

Description

Model interpretable method for detecting interaction characteristics in field of financial wind control
Technical Field
The invention relates to graph neural network technology and model interpretability technology, and in particular to a model interpretable method for detecting interactive features in the field of financial wind control.
Background
In recent years, a new generation of artificial intelligence technology marked by Machine Learning, and especially Deep Learning, has been developing in ever more advanced, complex and autonomous directions, bringing new opportunities for economic and social development. The application of AI has produced an explosion of new uses, increasingly permeating every industry and every aspect of human life, and is expected to give rise to new economic and social forms. Meanwhile, the ethics of science and technology have increasingly become an unavoidable question in current AI development and industrial application, and many sectors are exploring ethical principles, frameworks and governance mechanisms for AI.
One of the core issues of AI ethics is transparency and explainability. In November 2021, the United Nations Educational, Scientific and Cultural Organization (UNESCO) adopted the first global agreement on AI ethics, the "Recommendation on the Ethics of Artificial Intelligence", which puts forward ten AI principles including "transparency and explainability": the working of algorithms and the data used to train them should be transparent and explainable.
While not all AI systems are "black-box" algorithms, and some are no less interpretable than non-AI techniques, traditional software or human processes, current machine learning models, and especially deep learning models, tend to be opaque and difficult for humans to understand. Continued advances in AI are expected to bring autonomous systems that perceive, learn, decide and act. However, the practical utility of these systems is limited by whether the machine can adequately explain its reasoning and actions to human users. If users are to understand, trust and effectively manage a new generation of artificial intelligence partners, the transparency and interpretability of AI systems is critical. Therefore, in recent years explainable AI (XAI) has become an emerging field of AI research, with both academia and industry exploring methods and tools for understanding AI system behavior; interpretability is particularly critical in the financial and medical fields.
The rapid development of internet financial technology allows people to conveniently access financial services provided by banks, including funds, deposits, transfers and the like. Where loans are involved, customers can also obtain credit at a lower threshold. As financial services become more convenient, fraud techniques upgrade and iterate rapidly, becoming more intelligent, virtualized and concealed. From the earliest mainly offline financial fraud to today's purposeful, organized group financial fraud, banks need to continuously and iteratively update their risk control systems to reduce bad-debt losses.
In order to reduce bad-debt losses as much as possible, models with higher precision must be introduced. However, higher precision means higher model complexity and poorer interpretability: the harder it is to give a reasonable explanation when flagging a bad customer, the less business personnel can understand why the model rejects a customer, which is a problem banks find hard to accept. In view of this, the financial application evaluation specification for artificial intelligence algorithms published by the People's Bank of China in March 2021 contains explicit provisions on interpretability evaluation. It requires that features satisfy the following interpretability requirements:
1. the feature definition should satisfy the relevant business logic and rules.
2. The feature definition should be explicitly recorded in the system.
3. The feature definition should have a detailed record of the data Extract-Transform-Load (ETL) processing procedure.
The specified feature derivation requires the following requirements to be met:
1. the feature derivation should be reasonable. For fund type scenes, only discrete feature intersection is allowed, and complex and non-business-meaning feature intersection is not allowed.
2. For business-based feature derivation, the ETL feature derivation process should be detailed.
3. For algorithm-based feature derivation, the process and logic of the derivation need to be displayed.
Therefore, if the algorithm of the model is not interpretable, the choice of the model will be limited to a large extent.
Existing methods are mainly used to explain the effect of deep learning in the field of image analysis, and their interpretation precision is generally low. For applications with blurred boundaries, such as image recognition, relatively low interpretation precision may be acceptable. But for security applications, even a one-byte bias in an interpretation can lead to severe misunderstanding or error, and relatively low interpretation precision becomes unacceptable.
In addition, current interpretable methods are greatly limited by factors such as the algorithm, the model structure and the application scenario. Although they can be used to explain the behavioral decisions and prediction results of deep learning models, they may fail in the following situations:
1. If the model is an interactive model, such as a random forest: since current interpretability methods still cannot explain interactions in real time, models for explaining interactive modeling are still under study.
2. Correlation between features and interactions between features greatly increase the difficulty of model interpretation: not only the significance of individual features must be considered, but also the influence of correlations between different features on model decisions must be evaluated.
3. If the model does not capture causal relationships correctly: because interpretable methods explain the model directly, the correctness of the modeling itself is not tested beforehand.
4. If the parameters of the interpretation method are set incorrectly: parameter settings influence the interpretation result, and the stability of the parameters directly affects the reliability of the interpretation.
Therefore, those skilled in the art are dedicated to develop an interpretable method of a model for detecting interactive features in the financial wind control field, and make up for the shortcomings of interpretable and interactive feature detection schemes related to AI models in the financial field in the prior art.
Disclosure of Invention
In view of the above defects in the prior art, the technical problem to be solved by the present invention is how to optimize existing model interpretation algorithms, such as Local Interpretable Model-agnostic Explanations (LIME), SHAP (SHapley Additive exPlanations), Parzen windows, Partial Dependence Plots (PDP), etc., so that they can individually provide interpretation results with interactive features according to the relevance of the features, while maintaining precision comparable to tree models. An example of such an interaction feature is as follows:
(The example, a table of interaction features, appears as an image in the original publication.)
Currently, mainstream interpretable algorithms for tabular data in the field of financial wind control include LIME, SHAP, the Explainable Boosting Machine (EBM), and the like. Most of these algorithms can only provide single-feature interpretations and cannot detect interactions, whereas EBM uses the GAM2 algorithm and can detect pairwise feature interactions. The comparison is shown in FIG. 1 (LIME) and FIG. 2 (EBM), where the interpretation result of EBM includes feature1 × feature2 items, that is, interaction-feature interpretation items in the provided interpretation result.
In order to achieve the above object, the present invention provides an interpretable method of a model for detecting interactive features in the field of financial wind control, which comprises the following steps:
step 1, training an additive model according to existing training characteristics and training labels, wherein a classifier of the additive model is expressed as F (x):
F(x)=∑fi(xi)
wherein F(x) ∈ [0, 1], xi is the i-th training feature, and fi(xi) is the contribution of each xi to the overall prediction result;
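As an illustrative sketch only — not the patent's implementation — the step-1 additive model F(x) = Σ fi(xi) can be approximated by backfitting binned, piecewise-constant contribution functions. All names below are hypothetical, and squared loss replaces the classifier setting for brevity:

```python
import numpy as np

def fit_additive_model(X, y, n_bins=8, n_rounds=20, lr=0.5):
    """Backfit an additive model F(x) = intercept + sum_i f_i(x_i),
    with each f_i a piecewise-constant (binned) function."""
    n, d = X.shape
    intercept = y.mean()
    edges = [np.quantile(X[:, i], np.linspace(0, 1, n_bins + 1)) for i in range(d)]
    bins = [np.clip(np.searchsorted(edges[i][1:-1], X[:, i]), 0, n_bins - 1)
            for i in range(d)]
    contrib = [np.zeros(n_bins) for _ in range(d)]
    pred = np.full(n, intercept)
    for _ in range(n_rounds):
        for i in range(d):
            resid = y - pred
            for b in range(n_bins):
                mask = bins[i] == b
                if mask.any():
                    step = lr * resid[mask].mean()  # move f_i toward the bin-mean residual
                    contrib[i][b] += step
                    pred[mask] += step
    return intercept, edges, contrib

def predict_additive(X, intercept, edges, contrib):
    out = np.full(len(X), float(intercept))
    for i in range(X.shape[1]):
        b = np.clip(np.searchsorted(edges[i][1:-1], X[:, i]), 0, len(contrib[i]) - 1)
        out += contrib[i][b]
    return out
```

The returned bin tables are exactly the per-feature explanation items fi(xi) that a GAM exposes.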
step 2, detecting existing interactive feature pairs by using an interactive feature detection module, using the obtained interactive feature pairs, the training labels and the residuals of step 1 to construct a GAM model, and adding the additive model and this GAM model to obtain a GAM2 model containing the interactive feature pairs, the classifier of the GAM2 model being denoted F2(x):
F2(x) = ∑i fi(xi) + ∑{i,j}∈En fij(xi, xj)
wherein F2(x) ∈ [0, 1], En is the set of interactive feature pairs detected by the interactive feature detection module, {i, j} denotes any one of the interactive feature pairs, and fij is the transformation function of xi and xj for the pair {i, j} ∈ En;
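A minimal sketch of the step-2 idea, under assumed details that are not the claimed method: given interaction pairs detected elsewhere, each fij can be fit on the residual of the additive model as a 2-D binned lookup table (all names illustrative):

```python
import numpy as np
from itertools import product

def fit_pair_terms(X, residual, pairs, n_bins=4):
    """For each detected pair (i, j), fit f_ij on the residual of the
    additive model as a 2-D binned lookup table."""
    tables = {}
    for i, j in pairs:
        ei = np.quantile(X[:, i], np.linspace(0, 1, n_bins + 1))[1:-1]
        ej = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1))[1:-1]
        bi = np.searchsorted(ei, X[:, i])
        bj = np.searchsorted(ej, X[:, j])
        table = np.zeros((n_bins, n_bins))
        for a, b in product(range(n_bins), range(n_bins)):
            mask = (bi == a) & (bj == b)
            if mask.any():
                table[a, b] = residual[mask].mean()  # cell-wise mean residual
        tables[(i, j)] = (ei, ej, table)
    return tables

def predict_pairs(X, tables):
    out = np.zeros(len(X))
    for (i, j), (ei, ej, table) in tables.items():
        out += table[np.searchsorted(ei, X[:, i]), np.searchsorted(ej, X[:, j])]
    return out
```

Adding `predict_pairs` to the additive model's output gives the F2(x) form above: single-feature terms plus pairwise terms.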
step 3, according to the principle that a high-order interaction exists if and only if all of its low-order interactions exist, performing multiple iterations on the GAM2 model until the interactive feature order no longer increases in some iteration, obtaining a GAMn model, the classifier of which, containing n-order interactions, is denoted Fn(x):
Fn(x) = ∑i fi(xi) + ∑{i,j}∈En fij(xi, xj) + ... + ∑{i,j,...,n}∈En fij...n(xi, xj, ..., xn)
wherein Fn(x) ∈ [0, 1], n is the highest order of the n-order interactions, and (xi, xj, ... xn) denotes a set of high-dimensional interactive features;
and step 4, realizing visualization and report export.
Further, the interactive feature detection module in the step 2 is based on a graph neural network.
Further, the graph neural network in the interactive feature detection module is the L0-SIGN model, comprising an L0 edge detection model and a SIGN model.
Further, the input to the L0-SIGN model is a graph containing no edge information, wherein each of the training features is a node in Xn and the interactions between the training features correspond to the edges En; that is, a data sample n is represented by a graph:
Gn(Xn,En)
where En = {(en)ij} and (en)ij ∈ {1, 0},
where 1 indicates that an edge exists between features i and j, and 0 indicates that no interaction exists between i and j.
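A short illustration, using a hypothetical helper not taken from the patent, of representing one data sample as a graph Gn(Xn, En) with a symmetric 0/1 adjacency matrix:

```python
import numpy as np

def sample_to_graph(x, edge_pairs):
    """Represent one data sample as G_n(X_n, E_n): features are nodes,
    detected interactions are edges with (e_n)_ij in {0, 1}."""
    m = len(x)
    nodes = np.asarray(x, dtype=float)  # X_n: one value per feature node
    adj = np.zeros((m, m), dtype=int)   # E_n as a symmetric 0/1 matrix
    for i, j in edge_pairs:
        adj[i, j] = adj[j, i] = 1
    return nodes, adj
```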
Further, in the prediction process, the L0 edge detection model Fep(Xn; ω) is used to analyze whether edges are present in the graph, where ω is the parameter set of Fep, and the output is a set of edges E′n.
Further, the L0 edge detection model is a separate detection module that uses a matrix factorization method.
Further, the SIGN model is a graph classifier that provides predictions based on Gn(Xn, E′n); L0 regularization then limits the number of detected interactive feature pairs.
Further, the L0-SIGN model models the interactions of the initial nodes connected by edges, updates the representation of each initial node by aggregating all corresponding modeling results, and finally aggregates all updated initial node representations to obtain a final prediction.
Further, the SIGN model derives a prediction function fS for predicting classification results, and L0 regularization is then applied while fitting parameters by a stochastic gradient descent iterative method, yielding a final prediction function fLS for predicting classification results. The general form of the SIGN model prediction function is:
fLS(Gn) = fS(Gn(Xn, E′n); θ), with E′n = Fep(Xn; ω) obtained under L0 regularization
wherein the prediction function fS is the prediction function without regularization, the prediction function fLS is fS processed by the L0 regularization, and θ is the parameter set of the graph neural network.
Further, the initial input does not require edge information, i.e. En may be left unspecified, as in Gn(Xn, Ø).
the model interpretable method for detecting the interactive characteristics in the field of financial wind control, provided by the invention, has the following technical effects:
1. The technical scheme provided by the invention provides an interactive feature detection module based on a graph neural network, which can be embedded into a network to obtain interactive feature pairs of higher reliability, addressing the problem that current model-interpretability methods cannot obtain effective interactive-feature explanations at acceptable interpretation precision;
2. The technical scheme improves the existing EBM model by embedding the graph-neural-network-based interactive feature detection module, mitigating the fitting problems caused by the simple linear structure of the GAM model and improving the reliability and effectiveness of model interpretation;
3. The technical scheme generalizes the EBM model to identify high-dimensional feature interactions, so that high-precision black-box models such as deep learning can be introduced into the field of financial wind control, which demands high interpretability.
The conception, specific structure and technical effects of the present invention will be further described in conjunction with the accompanying drawings to fully understand the purpose, characteristics and effects of the present invention.
Drawings
FIG. 1 is a diagram of the interpretation results of a LIME algorithm interpreting a single sample;
FIG. 2 is a diagram of the result of an EBM algorithm interpreting a single sample;
FIG. 3 is a schematic diagram of the interactive feature detection network L0-SIGN algorithm.
Detailed Description
The technical contents of the preferred embodiments of the present invention will be more clearly and easily understood by referring to the drawings attached to the specification. The present invention may be embodied in many different forms of embodiments and the scope of the invention is not limited to the embodiments set forth herein.
The invention provides a model interpretable method for detecting interactive features in the field of financial wind control: a white-box modeling method with precision comparable to tree models that can provide personalized explanations for each sample, i.e. an interpretable model. It is mainly divided into two parts: an interactive feature detection module based on a graph neural network, and a model fitting method adopting GAMN.
Firstly, an interactive feature detection module based on a graph neural network:
Graph Neural Networks (GNNs) facilitate learning entities and their overall relationships. Prior work has used GNNs for relational inference in various fields; the embodiment of the invention uses them for interactive feature detection. The graph neural network adopted by the embodiment is L0-SIGN (L0-norm Statistical Interaction Graph Neural Network), where each data sample is considered a graph, the features are nodes, and the interactions between the features are edges. The L0-SIGN network can detect the interactive feature pairs with the largest contributions, and the Top-K interactive features are then used as part of the input to the subsequent GAMN model.
Compared with the EBM model, which separately computes the interaction strengths of all pairwise feature combinations, the graph-neural-network-based interactive feature detection module takes only the detected interactive features as input. Furthermore, EBM captures only the contribution of each individual feature interaction to the prediction, not the overall contribution of a set of interactive feature pairs.
Second, model fitting of GAMN:
The EBM model uses a two-step method to calculate the contribution strength of interactive features to the whole model. The embodiment of the invention improves on this basic two-step calculation: the second step, which calculates interactive feature strength, no longer uses EBM's Fast Interaction Detection (FAST) algorithm but instead integrates the graph-neural-network-based interactive feature detection module.
Meanwhile, the GAM2 method of the EBM model is improved: based on existing theoretical derivations, 2-order feature interactions are mapped to higher orders.
Thirdly, a visualization and report export module:
the local interpretable module and the global interpretation module are integrated in the same visual window, and a user can inquire the corresponding prediction result and interpretation result by inputting the ID number corresponding to each piece of data. Meanwhile, the explanation report of each client can be exported by one key through the export function of the visual window, and suspicious points existing in the client image are indicated.
In the field of bank risk control, detecting customer fraud risk, credit risk and the like currently carries high technical difficulty and time cost, and existing models exhibit a certain bias. Manual review requires deeply exploring each customer's raw data to find the anomalies in it, demanding great effort and expertise from practitioners. The technical scheme provided by the invention aims to reduce the manual review cost for practitioners, reduce the possibility of model misjudgment, and provide guidance for subsequent feature engineering.
The embodiment of the invention provides a model interpretable method for detecting interaction characteristics in the field of financial wind control, which comprises the following steps:
step 1, training an additive model according to the existing training characteristics and training labels, wherein a classifier of the additive model is expressed as F (x):
F(x)=∑fi(xi)
wherein F(x) ∈ [0, 1], xi is the i-th training feature, and fi(xi) is the contribution of each xi to the overall prediction result;
step 2, detecting existing interactive feature pairs by using the interactive feature detection module, using the obtained interactive feature pairs, the training labels and the residuals of step 1 to construct a GAM model, and adding the additive model and this GAM model to obtain a GAM2 model containing the interactive feature pairs; the classifier of the GAM2 model is denoted F2(x):
F2(x) = ∑i fi(xi) + ∑{i,j}∈En fij(xi, xj)
wherein F2(x) ∈ [0, 1], En is the set of interactive feature pairs detected by the interactive feature detection module, {i, j} denotes any one of the interactive feature pairs, and fij is the transformation function of xi and xj for the pair {i, j} ∈ En;
step 3, according to the principle that a high-order interaction exists if and only if all of its low-order interactions exist, performing multiple iterations on the GAM2 model until the interactive feature order no longer increases, obtaining a GAMn model, the classifier of which, containing n-order interactions, is denoted Fn(x):
Fn(x) = ∑i fi(xi) + ∑{i,j}∈En fij(xi, xj) + ... + ∑{i,j,...,n}∈En fij...n(xi, xj, ..., xn)
wherein Fn(x) ∈ [0, 1], n is the highest order of the n-order interactions, and (xi, xj, ... xn) denotes a set of high-dimensional interactive features;
The principle "a high-order interaction exists if and only if all of its low-order interactions exist" means, specifically: x1x2x3 exists if and only if its subsets x1x2, x2x3 and x1x3 all exist; this generalizes to high-dimensional space. However, because a set of n features has O(2^n) subsets, the requirements grow every time the order is raised. For example, extending an interaction to a 4-dimensional space requires that all of its lower-order subsets (four 3-order and six 2-order interactions) exist. Consequently, the algorithm provided by the invention cannot detect feature interactions of very high dimension, but it guarantees that the important interactions it reports exist.
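The subset-closure principle can be sketched as Apriori-style candidate generation: an order-(k+1) interaction is considered only if every one of its order-k subsets was detected. This is an illustrative sketch, not the patent's algorithm:

```python
from itertools import combinations

def next_order_candidates(detected):
    """Generate order-(k+1) interaction candidates from the detected
    order-k interactions, keeping only those whose order-k subsets all
    exist (high-order exists only if all low-order interactions exist)."""
    detected = {frozenset(s) for s in detected}
    if not detected:
        return set()
    k = len(next(iter(detected)))
    items = set().union(*detected)
    candidates = set()
    for combo in combinations(sorted(items), k + 1):
        if all(frozenset(sub) in detected for sub in combinations(combo, k)):
            candidates.add(frozenset(combo))
    return candidates
```

For example, from the detected pairs {1,2}, {2,3}, {1,3}, {3,4}, only {1,2,3} qualifies as a 3-order candidate, because every other triple is missing at least one pairwise subset.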
And 4, realizing visualization and report export.
The front end and back end are implemented in Java; the interpretation pipeline is rewritten based on the EBM part of Microsoft's open-source Python framework Interpret, with the graph-neural-network interactive feature detection module embedded, finally realizing the function of individually querying interpretation results.
Wherein, the interactive feature detection module in the step 2 is based on a graph neural network.
The graph neural network in the interactive feature detection module is the L0-SIGN model, comprising an L0 edge detection model combined with a SIGN model, as shown in FIG. 3.
The input to the L0-SIGN model is a graph containing no edge information, where each training feature is a node in Xn and the interactions between training features correspond to the edges En; that is, a data sample n is represented by a graph:
Gn(Xn,En)
where En = {(en)ij} and (en)ij ∈ {1, 0}; 1 indicates that an edge exists between features i and j, and 0 indicates that no interaction exists between i and j. The initial input does not require edge information, i.e. En may be left unspecified, as in Gn(Xn, Ø).
In the prediction process, the L0 edge detection model Fep(Xn; ω) is used to analyze whether an edge is present in the graph, where ω is the parameter set of Fep, and the output is a set of edges E′n.
The L0 edge detection model is a separate detection module that uses Matrix Factorization (MF).
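As a hedged illustration of the matrix-factorization idea (the exact form of the L0 edge detection model is not specified in this text), edges can be scored from learned per-feature embeddings, keeping pairs whose score exceeds a threshold; the embeddings here are a hypothetical input rather than a trained artifact:

```python
import numpy as np

def detect_edges(embeddings, threshold=0.5):
    """Matrix-factorization-style edge detector sketch: feature i has an
    embedding u_i, and an edge (i, j) is predicted when
    sigmoid(u_i . u_j) exceeds the threshold."""
    scores = embeddings @ embeddings.T
    probs = 1.0 / (1.0 + np.exp(-scores))  # sigmoid of the inner products
    m = len(embeddings)
    return [(i, j) for i in range(m) for j in range(i + 1, m)
            if probs[i, j] > threshold]
```

In the full model, the surviving pairs would form the edge set E′n passed on to the SIGN graph classifier.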
The SIGN model is a graph classifier that provides predictions based on Gn(Xn, E′n); L0 regularization then limits the number of detected interactive feature pairs.
The L0-SIGN model models the interactions of the initial nodes connected by edges, then updates the representation of each initial node by aggregating all corresponding modeling results, and finally aggregates all updated initial node representations to obtain a final prediction.
The SIGN model derives a prediction function fS for predicting classification results and then applies L0 regularization while fitting parameters by a stochastic gradient descent iterative method, obtaining a final prediction function fLS for predicting classification results. The general form of the SIGN model prediction function is:
fLS(Gn) = fS(Gn(Xn, E′n); θ), with E′n = Fep(Xn; ω) obtained under L0 regularization
wherein the prediction function fS is the prediction function without regularization, the prediction function fLS is fS processed by the L0 regularization, and θ is the parameter set of the graph neural network.
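The interplay of prediction loss and L0 regularization can be sketched as an objective that adds a penalty proportional to the number of active edges, pushing the model to keep only the most useful interaction pairs. This is an assumed form for illustration; the exact objective is not given in this text:

```python
import numpy as np

def l0_sign_objective(y_true, y_pred, edge_gates, lam=0.01):
    """Sketch of an L0-SIGN-style objective: binary cross-entropy loss
    plus lambda times the number of active edges (the L0 'norm' of the
    edge gates). A real implementation would use a differentiable
    relaxation of the gate count."""
    eps = 1e-9
    bce = -np.mean(y_true * np.log(y_pred + eps)
                   + (1 - y_true) * np.log(1 - y_pred + eps))
    return bce + lam * np.count_nonzero(edge_gates)
```

With identical predictions, an edge set with more active gates always scores a strictly larger objective, which is what drives the sparsity of the detected interaction pairs.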
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concept. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (10)

1. A model interpretable method for detecting interactive features in the field of financial wind control, the method comprising the steps of:
step 1, training an additive model according to existing training characteristics and training labels, wherein a classifier of the additive model is expressed as F (x):
F(x)=∑fi(xi)
wherein F(x) ∈ [0, 1], xi is the i-th training feature, and fi(xi) is the contribution of each xi to the overall prediction result;
step 2, detecting existing interactive feature pairs by using an interactive feature detection module, constructing a GAM model from the obtained interactive feature pairs, the training labels and the residuals of step 1, and adding the additive model and this GAM model to obtain a GAM2 model containing the interactive feature pairs, the classifier of the GAM2 model being denoted F2(x):
F2(x) = ∑i fi(xi) + ∑{i,j}∈En fij(xi, xj)
wherein F2(x) ∈ [0, 1], En is the set of interactive feature pairs detected by the interactive feature detection module, {i, j} denotes any one of the interactive feature pairs, and fij is the transformation function of xi and xj for the pair {i, j} ∈ En;
step 3, according to the principle that a high-order interaction exists if and only if all of its low-order interactions exist, performing multiple rounds of iteration on the GAM2 model until the interactive feature order no longer increases in some round, obtaining a GAMn model, the classifier of which, containing n-order interactions, is denoted Fn(x):
Fn(x) = ∑i fi(xi) + ∑{i,j}∈En fij(xi, xj) + ... + ∑{i,j,...,n}∈En fij...n(xi, xj, ..., xn)
wherein Fn(x) ∈ [0, 1], n is the highest order of the n-order interactions, and (xi, xj, ... xn) denotes a set of high-dimensional interactive features;
and step 4, realizing visualization and report export.
2. The model interpretable method of detecting interactive features in the field of financial wind control according to claim 1, wherein the interactive feature detection module in the step 2 is based on a graph neural network.
3. The model interpretable method of detecting interactive features in the field of financial wind control as claimed in claim 2, wherein the graph neural network in the interactive feature detection module is the L0-SIGN model, comprising an L0 edge detection model and a SIGN model.
4. The model interpretable method of detecting interactive features in the field of financial wind control as claimed in claim 3, wherein the input to the L0-SIGN model is a graph containing no edge information, wherein each of the training features is a node in X_n and the interactions between the training features correspond to the edges in E_n; that is, a data sample n is represented by a graph:

G_n(X_n, E_n)

wherein E_n is the set of edge indicators

(e_n)_ij ∈ {1, 0}

where 1 indicates that there is an edge (an interaction) between {i, j}, and 0 indicates that there is no interaction between {i, j}.
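The claim-4 representation can be illustrated with a short sketch (names hypothetical): each data sample becomes a node-feature vector X_n plus an enumeration of candidate edges, with every interaction indicator (e_n)_ij initialised to 0 because the input carries no edge information.

```python
# Illustrative encoding of the claim-4 graph G_n(X_n, E_n): nodes are the
# training features; edge indicators start at 0 and are left for the
# detection model to predict. All names are hypothetical.
import itertools
import numpy as np

def sample_to_graph(x):
    """Return (node features X_n, candidate edge pairs, edge indicators e_n)."""
    x = np.asarray(x, dtype=float)
    d = len(x)
    candidate_edges = list(itertools.combinations(range(d), 2))
    e = np.zeros((d, d), dtype=int)  # (e_n)_ij in {0, 1}, to be predicted
    return x, candidate_edges, e
```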
5. The model interpretable method of detecting interactive features in the field of financial wind control as claimed in claim 4, wherein in the prediction process the L0 edge detection model F_ep(X_n; ω) is used to analyze whether each edge is present in the graph, where ω is the parameter set of F_ep, and its output is a set of edges E′_n.
6. The model interpretable method of detecting interactive features in the field of financial wind control as claimed in claim 5, wherein the L0 edge detection model is a separate detection module that uses a matrix decomposition method.
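Claim 6 only states that the edge detector uses a matrix decomposition; one minimal, hypothetical reading is a factorization-machine-style score, where each feature i carries a low-rank factor v_i and the candidate edge {i, j} is scored by the inner product ⟨v_i, v_j⟩. The thresholding rule and all names below are illustrative.

```python
# Hypothetical low-rank edge scoring for claim 6: score edge {i, j} as the
# inner product of the per-feature factors, then keep the pairs whose score
# exceeds a threshold.
import numpy as np

def edge_scores(V):
    """V: (d, k) factor matrix; returns a symmetric d x d edge-score matrix."""
    S = V @ V.T
    np.fill_diagonal(S, 0.0)  # no self-interactions
    return S

def detect_edges(V, threshold=0.0):
    """Return the feature pairs whose score exceeds the threshold."""
    S = edge_scores(V)
    d = S.shape[0]
    return [(i, j) for i in range(d) for j in range(i + 1, d) if S[i, j] > threshold]
```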
7. The method as claimed in claim 5, wherein the SIGN model is a graph classifier that provides predictions based on the interactive features of G_n(X_n, E′_n), and the L0 regularization is then applied to limit the number of detected interactive feature pairs.
8. The model interpretable method of detecting interactive features in the field of financial wind control of claim 7, wherein the L0-SIGN model interactively models the initial nodes connected by edges, updates the representation of each initial node by aggregating all of the corresponding modeling results, and finally aggregates all of the updated initial node representations to obtain the final prediction.
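The aggregation described in claim 8 can be sketched as one round of message passing. The elementwise-product interaction function, embedding size, and sigmoid readout are illustrative assumptions, not the patent's specification.

```python
# One-round message-passing sketch of the claim-8 aggregation: model each
# retained edge {i, j} from the initial node representations (here: their
# elementwise product), update every node by summing its incident messages,
# then pool all updated nodes for the final score.
import numpy as np

def sign_forward(node_emb, edges, w_out):
    """node_emb: (d, k) initial node representations; edges: list of (i, j)."""
    updated = node_emb.astype(float).copy()
    for (i, j) in edges:
        msg = node_emb[i] * node_emb[j]  # pairwise interaction modelling
        updated[i] = updated[i] + msg    # summarize messages at node i
        updated[j] = updated[j] + msg    # and at node j
    pooled = updated.sum(axis=0)         # summarize all node representations
    return float(1.0 / (1.0 + np.exp(-(pooled @ w_out))))  # score in (0, 1)
```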
9. The method as claimed in claim 8, wherein the SIGN model first uses a prediction function f_S to predict the classification result, and then applies the L0 regularization while fitting the parameters by stochastic gradient descent, to obtain the final prediction function f_LS used to predict the classification result; the generic form of the SIGN model prediction function is:

f_LS(G_n; ω, θ) = f_S(G_n(X_n, F_ep(X_n; ω)); θ)

wherein the prediction function f_S is the prediction function without regularization, the prediction function f_LS is the prediction function f_S processed by the L0 regularization, and θ is the parameter set in the graph neural network.
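The L0 term of claim 9 can be illustrated with a simplified differentiable surrogate: each candidate edge carries a gate probability, and the loss adds λ times the expected number of open gates, penalizing the count of detected interaction pairs. The L0-SIGN literature uses stretched hard-concrete gates; the plain sigmoid gates below are a deliberate simplification, and all names are hypothetical.

```python
# Simplified surrogate for the L0 regularization in claim 9: penalize the
# expected number of active edge gates on top of the classification loss.
import numpy as np

def l0_penalty(gate_logits, lam=0.01):
    """lambda * expected number of active edges under sigmoid gates."""
    p_open = 1.0 / (1.0 + np.exp(-np.asarray(gate_logits, dtype=float)))
    return lam * p_open.sum()

def regularized_loss(pred, label, gate_logits, lam=0.01):
    """Binary cross-entropy of f_S plus the L0 surrogate on the edge gates."""
    bce = -(label * np.log(pred) + (1.0 - label) * np.log(1.0 - pred))
    return bce + l0_penalty(gate_logits, lam)
```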
10. The model interpretable method of detecting interactive features in the field of financial wind control as claimed in claim 4, wherein the initial input does not require edge information:

G_n = G_n(X_n, ∅)
CN202210487256.4A 2022-05-06 2022-05-06 Model interpretable method for detecting interaction characteristics in field of financial wind control Pending CN114742643A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210487256.4A CN114742643A (en) 2022-05-06 2022-05-06 Model interpretable method for detecting interaction characteristics in field of financial wind control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210487256.4A CN114742643A (en) 2022-05-06 2022-05-06 Model interpretable method for detecting interaction characteristics in field of financial wind control

Publications (1)

Publication Number Publication Date
CN114742643A true CN114742643A (en) 2022-07-12

Family

ID=82285853

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210487256.4A Pending CN114742643A (en) 2022-05-06 2022-05-06 Model interpretable method for detecting interaction characteristics in field of financial wind control

Country Status (1)

Country Link
CN (1) CN114742643A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115953248A (en) * 2023-03-01 2023-04-11 Alipay (Hangzhou) Information Technology Co., Ltd. Wind control method, device, equipment and medium based on Shapley additive explanation

Similar Documents

Publication Publication Date Title
Liang et al. Explaining the black-box model: A survey of local interpretation methods for deep neural networks
US11681925B2 (en) Techniques for creating, analyzing, and modifying neural networks
EP3522078A1 (en) Explainable artificial intelligence
US11640539B2 (en) Techniques for visualizing the operation of neural networks using samples of training data
CN107346286A (en) A kind of Software Defects Predict Methods based on core principle component analysis and extreme learning machine
CN112765896A (en) LSTM-based water treatment time sequence data anomaly detection method
US20210012209A1 (en) Techniques for modifying neural network definitions
CN114036531A (en) Multi-scale code measurement-based software security vulnerability detection method
US11615321B2 (en) Techniques for modifying the operation of neural networks
CN114742643A (en) Model interpretable method for detecting interaction characteristics in field of financial wind control
CN110781970A (en) Method, device and equipment for generating classifier and storage medium
Ayoobi et al. SpArX: Sparse argumentative explanations for neural networks
CN114818999B (en) Account identification method and system based on self-encoder and generation countermeasure network
Xiang et al. Computation of cnn’s sensitivity to input perturbation
Hu A web application for crowd counting by building parallel and direct connection-based CNN architectures
Ahmed et al. Robust adversarial uncertainty quantification for deep learning fine-tuning
Jiang et al. ALAE: self-attention reconstruction network for multivariate time series anomaly identification
US11822564B1 (en) Graphical user interface enabling interactive visualizations using a meta-database constructed from autonomously scanned disparate and heterogeneous sources
US20240112045A1 (en) Synthetic data generation for machine learning models
Fernandes et al. A Stratified Sampling Algorithm for Artificial Neural Networks
US20230368013A1 (en) Accelerated model training from disparate and heterogeneous sources using a meta-database
US20230367787A1 (en) Construction of a meta-database from autonomously scanned disparate and heterogeneous sources
Rathnayke et al. Deep Rule Visualizer: A Novel Explainable Deep Neural Network Visualization Architecture
Raubitzek et al. Combining Fractional Derivatives and Supervised Machine Learning: A Review
Kan et al. Event log anomaly detection method based on auto-encoder and control flow

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination