CN112434790A

CN112434790A - Self-interpretation method for convolutional neural network to judge partial black box problem

Info

Publication number: CN112434790A
Application number: CN202011249200.2A
Authority: CN
Inventors: 赵金伟; 王启舟; 邱万力; 黑新宏; 答龙超; 王伟; 谢国; 胡潇
Original assignee: Xian University of Technology
Current assignee: Xian University of Technology
Priority date: 2020-11-10
Filing date: 2020-11-10
Publication date: 2021-03-02
Anticipated expiration: 2040-11-10
Also published as: CN112434790B

Abstract

The invention discloses a self-interpretation method for judging partial black box problems of a convolutional neural network, which is implemented according to the following steps: step 1, providing an interpretable distance as an interpretable index of a measurement model; step 2, forming a feature extractor by the convolution layer and the pooling layer of the CNN network; step 3, taking the output of the feature extractor obtained in the step 2, namely the feature graph, as the input of the CNN network full connection layer; step 4, in the full connection layer, classifying and labeling the input image samples in the CNN network by using the characteristic diagram in the step 3; step 5, using the full connection layer to form the discrimination part of the CNN network; step 6, constructing a DCLM model of three layers of nodes; and 7, providing a novel game method to carry out game training of the DCLM model and the CNN network constructed in the step 6 so as to improve the interpretability of the CNN network discrimination part. The invention solves the problem of large error of the interpretation method in the prior art.

Description

Self-interpretation method for convolutional neural network to judge partial black box problem

Technical Field

The invention belongs to the technical field of deep learning in the field of computers, and particularly relates to a self-interpretation method for judging partial black box problems of a convolutional neural network.

Background

In recent years, convolutional neural networks (hereinafter CNN), as a representative algorithm for deep learning, have been shown to significantly surpass the capability of human beings in some specific tasks such as computer vision, computer games, and the like. However, the way that neural networks deal with problems is difficult to understand and interpret by people, and due to the ubiquitous difficulty in interpretation, many problems in fairness, privacy, robustness and credibility are caused in application; many researchers are currently trying to explain in different ways around the black box problem of CNN.

The so-called black box problem of CNN is an unexplainable problem whose main difficulty is in its discrimination part, and there are two types of main methods for dealing with such problems: a pre-method and a post-method (Holzinger et al, 2019); since the prior method is generally a transparent modeling method (arietaa et al, 2020), which is difficult to obtain an explanation about a discriminant part, the common technique mainly focuses on the second posterior method.

Early post-hoc approaches obtained a global interpretation of the neural network, mainly by extracting a predictable model. (Craven & Shavlik, 1999; Krishnan et al, 1999; Botz, 2002; Johansson & Niklason, 2009) proposed a method to find a decision tree for interpreting neural networks by maximizing the gain ratio and estimating the fidelity of the current model;

(Craven & Shavlik, 1994; Johansson & Niklasson, 2003; Augasta & Kathirvakumar, 2012; Sebastian et al, 2015; Zilke et al, 2016) proposed a specific rule extraction method to search for the best interpretable rule from a neural network;

in recent years, some feature association methods have been produced, such as a network classification decision decomposition method based on deep taylor decomposition proposed by Montavon et al (Montavon et al, 2017); shrikumar et al (Shrikumar et al, 2016) propose a DeepLIFT algorithm for calculating a multi-layer neural network importance score by interpreting the differences in the outputs of certain reference outputs by their difference in their reference inputs;

in addition, Che et al propose a simple rectification method called interpretable simulation learning, which extracts interpretable simple models by using gradient enhanced trees, and Thiagarajan et al establish tree-like view representations of complex models by hierarchical division of feature spaces to improve interpretability; still other scholars have proposed a rectification method for transforming knowledge from a model set to a single model, Wu et al have proposed a tree regularization method based on knowledge rectification to represent a neural network output feature space based on multi-layer perception;

however, these methods can only solve the unexplained problems of the trained neural network or the trained deep neural network with explicit input features; moreover, due to the existence of data deviation and noise data in the training data set, the traditional machine learning method is difficult to ensure the essential condition that the interpretable performance and the generalization performance of the identification part are consistent and converged; besides, the interpretable performance is a distance between the discriminant part and its optimal interpretable model, and when the discriminant part has better generalization performance, it tends to deviate from its optimal interpretable model, thereby generating an incorrect interpretable model.

Disclosure of Invention

The invention aims to provide a self-interpretation method for the problem of judging partial black boxes of a convolutional neural network, and the self-interpretation method solves the problem of overlarge error of an interpretation method in the prior art.

The invention adopts the technical scheme that a self-explanation method for judging partial black box problems of a convolutional neural network is implemented according to the following steps:

step 1, providing an interpretable distance as an interpretable index of a measurement model;

step 2, forming a feature extractor by the convolution layer and the pooling layer of the CNN network;

step 3, taking the output of the feature extractor obtained in the step 2, namely the feature graph, as the input of the CNN network full connection layer;

step 4, in the full connection layer, classifying and labeling the input image samples in the CNN network by using the characteristic diagram in the step 3;

step 5, using the full connection layer to form the discrimination part of the CNN network;

step 6, constructing a DCLM model of three layers of nodes;

and 7, providing a novel game method to carry out game training of the DCLM model and the CNN network constructed in the step 6 so as to improve the interpretability of the CNN network discrimination part.

The present invention is also characterized in that,

in the step 1, under the same input data set, the similarity is measured by the variance of the difference between the output of the original CNN network model and the output of the constructed DCLM model, and the measurement result is initially named as an interpretation distance, and the shape of the judging part of the CNN network is closer to the shape of the optimal interpretable model when the interpretation distance is smaller, so that the interpretable performance of the interpreting part is better;

assuming that Q is a compact metric space, Z is a sample set, v is a Borel metric in the metric space Q, such as a Lebesgue metric or an edge metric, in a squared integrable function space to the metric space Q

In (f), the interpretation distance phi between the discriminating part f (x) and the model P (x) of the best interpretation thereof_d(P x, f) is expressed as:

wherein

The step 6 is as follows:

the three-layer network structure of the DCLM is as follows: the characteristic predicate layer as a first layer is composed of characteristic predicate nodes, wherein each characteristic predicate node has a characteristic predicate function Z (Γ) used for representing whether a neuron of a discriminant part of a first fully-connected layer in the CNN network has the capability of capturing characteristics, and the characteristic predicate function Z (Γ) is represented as follows:

where i ∈ {1, 2,..., k }, τ_iIs the ith feature map in Γ, and wi is its corresponding weight vector in the first fully-connected layer, called the feature capture layer, where x represents the convolution.

In the step 6, the decision result predicate layer is the bottommost layer, and each result predicate node has a result predicate function which represents whether the output neuron of the judgment part is greater than 0; the corresponding function is expressed as follows:

all the nodes of the characteristic predicate layer and the result predicate layer in the step 6 are connected with one or more disjunct nodes of the middle layer, wherein the disjunct layers are called disjunct layers and represent true or false true value conditions; each disjunct represents a disjunct relationship of all the characteristic predicate layer nodes and the result predicate layer nodes, and the disjunct relationship is characterized by a disjunct normal form; if a predicate layer node is connected with a disjunct layer node by a boundary of true or false, and the predicate function of the predicate layer node is followed by a non-operation in the disjunct operation, a potential function of a disjunct normal form is obtained by a Lukasiewicz method as follows:

φ_c(y)＝min(1，T(Γ，y)) (3)

in the formula:

wherein N is the number of nodes of the feature predicate layer; gamma is characteristic diagram, Z_j() Is a feature predicate layer function, and D () is a result layer function; if logical network parameter a_jIf the logic network parameter a is equal to 1, the boundary between the predicate node and the disjunct node is false_jA value of 0 indicates a true boundary.

In step 6, the DCLM model is set to include a disjunctive relationship:

is equal to

Wherein Z_j() Is a feature predicate layer function, and j is an element (1, 4), and y is the neuron output of the CNN network; the potential function of this disjunctive normal form is:

wherein, a₁＝a₄1 and a₂＝a₃＝0；

The group truth conditional probability distribution of the DCLM is as follows:

where G is the number of all ground functions, the scoring function xi in the formula denotes as follows:

output value y of neuron output layer of CNN network_dclmComprises the following steps:

y_dclm＝(y_dclm，1，y_dclm，2，...，y_dclm，G) (7)

wherein λ is_iFor the weight of the ith group function, by maximizing the likelihood function, the optimal output value in the DCLM model can be obtained: y is_dclmLogical network parameter a_jAnd λ_i(ii) a The process of maximizing the likelihood function is shown as follows:

in the formula, phi_ciAs a function of potential, a_iIs a logical network parameter;

for the DCLM, firstly, a logic network is constructed, and then a maximum a posteriori MAP algorithm is adopted for model extraction, which specifically comprises the following steps:

inputting a feature map, a feature capture matrix, a label corresponding to input and an input number;

initializing initial values of a true pseudo edge set and y-dclm, and setting the number of initialization iterations to be 1; entering an outer loop body to perform the following operations on all characteristic graphs:

solving cosine similarity by using the current characteristic diagram and a weight matrix corresponding to the current characteristic diagram, storing the cosine similarity, and calculating and storing a product of a modulus of the current characteristic diagram and the weight matrix;

traversing each dimension of the output vector, solving the latest edge value in the analytic expression corresponding to the current dimension by using three parameters of a similarity result obtained from the upper part of the output vector, a product of models and corresponding output y-dclm, solving a union set by using an existing edge value set and the currently solved edge value, and increasing the iteration count and adding 1;

and when the iteration times are equal to the initial input number, the algorithm ending condition is met.

Step 7, the game iteration optimization idea is as follows:

inputting: inputting a data set and a target output;

initialization: the logic network is named as LN and CNN network is named as CN;

firstly: putting the sample in CN to obtain a characteristic diagram FM and the output of CN, obtaining the output of the logic network in a logic network model LN by using the characteristic diagram FM, calculating a Loss function Loss by using the output of the logic network, the output of the label and the output of CN, and updating CN;

secondly, substituting the updated CN into the sample to obtain a new characteristic diagram, a weight matrix corresponding to the full connection layer and the characteristic diagram and the output of the CN, and continuously updating and constructing a logic network LN;

forming a process that the CN is continuously corrected and the CN constructs the logic network; and circulating in such a way until the end condition is met and ending.

Step 7 is specifically as follows:

from the measurement of interpretation distance, when the discriminating portion of CNN is sufficiently similar to the shape of its optimal interpretable model, the discriminating portion has good interpretable performance, but its generalization performance tends to be degraded, which is mainly due to the fact that two sufficient conditions for consistent convergence of performance are difficult to guarantee, as shown in the following formula:

φ_d(P, f) ═ 0, f (x) is the CNN discriminant, P (x) is the best-explained model;

since this sufficient requirement is difficult to satisfy, the equalization problem always exists, and its optimal interpretable model P and its optimized discriminative part are unknown, it may be a feasible solution to automatically extract one interpretable model from the discriminative part and then iteratively reduce the interpretation distance between the two models in the training process when considering the equalization problem, as follows:

to avoid degrading generalization performance, the maximum probability of p (w | X, y) should be guaranteed_t) Where X is a training sample, w is a parameter set for CNN, y_tA target vector of X, one obtains:

wherein the content of the first and second substances,

when DCLM is known, y_dclmIs an optimal solution thereof, and

p(y*d_clm|w，X)＝1 (14)

based on this, we obtain:

likewise, the parameter set w, y of the training sample X, CNN is known as the input value_tFor the target vector of X, fnn is the optimal solution for CNN net, which can be:

p(y_t|w，X)＝p(y_t|f_nn，w，X)p(f_nn|y*_dclm，w，X) (16)

if the parameter set w and the training sample X of CNN are given, and the loss function is

The conditional probability distribution function is then:

simultaneously:

wherein xi 1, xi 2 are scoring functions; by maximizing the likelihood function of p (w | X, yt), the optimal set w of parameters for CNN can be obtained, assuming w obeys a gaussian distribution:

where α is an element parameter determined by the variance of the selected gaussian distribution, this equation is transformed into a minimization problem, which is obtained:

the self-explanation method for the black box problem of the judging part of the convolutional neural network has the advantages that on the premise of not damaging the generalization performance of the convolutional neural network, the causal relationship of the CNN without the prior structure is actively extracted, the explanation is finally obtained through the causal relationship, and meanwhile, the interpretable performance of the judging part of the CNN can be improved. In the process, the invention provides an interpretable model, namely a depth perception learning model (DCLM), to express the causal relationship of the judging part, and provides a greedy algorithm to automatically extract the DCLM from the judging part by solving the problem of the maximum satisfiability of the judging part. A new game method is provided, the distance between the two models is reduced through iteration, the two models are corrected, and the interpretable performance of the judgment part is improved on the premise of not greatly reducing the generalization performance of the judgment part. Meanwhile, the generalization performance of the DCLM is also improved. An interpretable distance is presented for evaluating and measuring the distance between the discriminant and the interpretable model on the unexplainable problem.

Drawings

FIG. 1 is a concrete structural representation of a three-layer network structure of a deep perceptual learning model DCLM;

FIG. 2 is a disjunctive relational representation in the deep perceptual learning model DCLM;

FIG. 3 is a graph of the accuracy of DCLMs and CNN-DCLMs for epochs during a game;

FIG. 4 is an explanatory distance of DCLMs from CNNs and CNN-DCLMs;

FIG. 5 is the information entropy of DCLM;

FIG. 6 shows the accuracy of CNN and CNN-DCLM.

Detailed Description

The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

The invention relates to a self-explanation method for judging partial black box problems of a convolutional neural network, which is implemented according to the following steps:

wherein

The step 6 is as follows:

as shown in fig. 1, when constructing the DCLM model, since the relationship between the features captured by the weight matrix of the first layer and the output vectors thereof is an internal relationship of the CNN network structure, which is neither related to the input data nor to the abstraction degree of the features, it can be considered that the distinguishing part can be better interpreted by a logical relationship, and the representation of the logical relationship is represented by a DCLM three-layer network structure, which is: the characteristic predicate layer as a first layer is composed of characteristic predicate nodes, wherein each characteristic predicate node has a characteristic predicate function Z (Γ) used for representing whether a neuron of a discriminant part of a first fully-connected layer in the CNN network has the capability of capturing characteristics, and the characteristic predicate function Z (Γ) is represented as follows:

where i ∈ {1, 2,..., k }, τ_iIs the ith characteristic diagram of Γ, w_iIs its corresponding weight vector in the first fully-connected layer, called the feature capture layer, where x represents the convolution.

step 6, constructing a DCLM model of three layers of nodes;

φ_c(y)＝min(1，T(Γ，y)) (3)

in the formula:

As shown in fig. 2, the DCLM model set in step 6 includes an disjunctive relationship:

is equal to

wherein, a₁＝a₄1 and a₂＝a₃＝0；

The group truth conditional probability distribution of the DCLM is as follows:

y_dclm＝(y_dclm，1，y_dclm，2，...，y_dclm，G) (7)

Step 7, the game iteration optimization idea is as follows:

inputting: inputting a data set and a target output;

Step 7 is specifically as follows:

since this sufficient requirement is difficult to satisfy, the trade-off problem always exists, and its optimum interpretable model P and its decision of optimizationWhile not known, to prove this conclusion, one can first focus on only one neuron of the CNN network, because each input channel f (x) of the neuron can be regarded as a kernel function K (x, w), where w is a weight vector containing an offset in the neuron, which can span the Hilbert space, Q is a compact metric space, Z is a set of samples, and v is a Borel metric in the metric space Q, such as the Lebesgue metric or the edge metric, in the squared integrable function space to the metric space Q

The method comprises the following steps:

H_Kis one is arranged at

Linear function of (c), however, due to data bias and noisy data in the training dataset, its optimal interpretable model is in most engineering applications

The above is often a non-linear function because the continuous linear functional is integrated in

Is not dense. Therefore, the sufficient requirements are difficult to satisfy, and the trade-off problem always exists, which also exists in the discriminating part of CNN; however, its optimal interpretable model P and its optimized discriminative part are unknown, so, when considering the trade-off problem, extracting one interpretable model from the discriminative part and then iteratively reducing the interpretation distance between the two models during the training process may be a feasible solution, as follows: the learning algorithm for CNN can improve its performance interpretability if the minimum loss function for CNN can guarantee minimum interpretability performance. If not, generalization performance and interpretability performanceThe trade-off problem between (a) and (b) will no longer exist. To demonstrate the existence of a problem, one may focus on only one neuron of CNN; if one input channel f (x) of a neuron is considered as a kernel function K (x, w) (w is a weight vector containing a bias of the neuron), it will open up a kernel hilbert space for the neuron.

H_KCan be regarded as one in

The linear function set above is the solution space of the neuron. Based on the following introduction

The essential conditions for consistent convergence of generalization performance and interpretable performance are discussed in the publication:

lemma 1, a continuous linear functional set on a separable hilbert space is not dense on a square integrable function space;

lemma 2, a set of continuous non-linear functionals of separable hilbert space is dense everywhere in the L2 theorem.

The discussion is as follows:

at H_KWhen the optimal input channel f is approximate to a linear function, the optimal interpretable model P is in H_KWithout any approximation of a linear function or in the absence, according to theorem 1, the conventional training process cannot guarantee that f (x) approximates P (x). From lemma 2, it can be found that convergence cannot be achieved until P x (x) approximates f x (x). Here, if the approximation is defined as the similarity of the curve shapes of the f (x) and P (x) functions, then the essential condition for the consistent convergence of the two performances is that φ (x) is the_d(P, f) ═ 0. The sufficient requirements are still true for the discriminating part of CNN. However, in most engineering applications, this condition is difficult to guarantee in most cases due to the data bias and noise data in the training data set. There is always a trade-off between the two properties of the discriminating part.

From the above conclusions, to fully solve the trade-offsProblem, should reduce phi_d(P, f) ═ 0(f (x) is the CNN discriminant, and P (x) is the best-explained model. However, P and f are unknown, and therefore, in the training process, an interpretable model P (x) is extracted from the discriminative portion f (x), and then phi is iteratively reduced_d(P, f) is a more convenient method.

wherein the content of the first and second substances,

when DCLM is known, y_dclmIs an optimal solution thereof, and

p(y*_dclm|w，X)＝1 (14)

based on this, we obtain:

p(y_t|w，X)＝p(y_t|f_nn，w，X)p(f_nn|y*_dclm，w，X) (16)

The conditional probability distribution function is then:

simultaneously:

and (3) experimental verification:

experiments were designed in the present invention to verify the effectiveness of the invention. The first experiment verifies whether the self-explanatory method can improve the interpretable performance of CNN without reducing the generalization performance of CNN. The second experiment verifies whether the method can tend to stabilize and converge in the game process.

The structures of CNN3 (including 3 convolutional layers, 3 max pooling layers, 3 full-link layers (FCLs), and 1 output layer), CNN5 (including 5 convolutional layers, 5 max pooling layers, 3 full-link layers, and 1 output layer), and CNN8 (including 8 convolutional layers, 8 max pooling layers, 3 full-link layers, and 1 output layer) were used in the experiment. The final structures obtained by training the CNNs by the method are named as CNN3-DCLM, CNN5-DCLM and CNN8-DCLM respectively. All experiments were performed using the mnst (leconet al., 1998), fashinon mnst (Zalando,2017) and emist (cohenetal., 2017) reference data sets. All algorithms were implemented in Python using a pytorech library (paszkey et al, 2019), and the experiments were run on a server equipped with an Intel Xeon 4110(2.1GHz) processor, 20GBRAM, and nvidia telsa t 4.

TABLE 1 Classification precision and interpretation distance of three game methods on DCLM-CNN

Experiment 1 Performance validation of the methods herein on CNN

Experiments also compared Soft Decision Trees (SDT). (Hinton,2017) is based on trained CNN3, CNN5 and CNN8, namely CNN3-sdt, CNN5-sdt and CNN 8-sdt. These baseline methods also included SDT (frost & (Hinton,2017) based on trained CNN3-dclm, CNN5-dclm and CNN8-dclm, CNN3-SDT, CNN5-SDT and CNN8-SDT, respectively, all algorithms had these indicators on all test datasets as shown in table 1.

As can be seen from table 1, the accuracy of CNN-DCLMs was higher than both the SDT and DCLM interpretable models and about 1.4 percentage points lower than that of CNN on all baseline datasets. It is noteworthy, however, that with the exception of CNN3-SDT on the Emnist dataset, the interpretation distance of all CNN-DCLMs on most datasets was about 5 percentage points lower than that of most SDT from their CNNs, which may prove that the self-interpretation method can improve the interpretability of the resolved portion of CNN without significantly reducing the generalization performance of CNN;

on the Mnist dataset, the accuracy of DCLMs was only 0.7 percentage points lower than that of SDT on CNN5-SDT and CNN8-SDT, except CNN 3-SDT. Through the Emnist data set and the FashinonMnist data set, the accuracy of DCLMs can be found to be 2 percent lower than that of SDT CNN 3-SDT; it can also be found that the accuracy of DCLMs of CNN5-SDT and CNN8-SDT is about 4.3 percent higher than that of SDTs in the two data sets. This is mainly because the feature maps output by the feature extraction parts of CNN5 and CNN8 are more abstract than CNN3 for the Emnist dataset and the fast on mnsist dataset. The abstract feature graph can easily represent the property of the predicate that the DCLM is distortion-free, so it does not hinder or affect the generation of the DCLM;

experiment 2 Convergence test of the proposed method

The convergence of the method is verified through experiments: CNN3, CNN5, CNN8 were compared with CNN3-dclm, CNN5-dclm, CNN8-dclm, respectively. 25 epochs are required for each training and all results are measured at each epoch, as shown in fig. 3, 4, 6, 5. Each graph includes 9 secondary graphs. The three subgraphs in the left column are experiments with the mnst dataset. The three subgraphs in the middle column are the FashinnMnist dataset and the three subgraphs in the right column are the Emnist dataset. FIG. 3 shows the accuracy of DCLMs and CNN-DCLMs for each epoch during the gaming process. From these data it is evident that the accuracy of dclm and of these cnn-dclm are steadily increasing at an early stage. In the next stage, their accuracy tends to be stable. This indicates that the gaming method does not affect the enhancement of dclm and cnn generalization performance;

FIG. 4 shows the interpreted distances of DCLMs from CNNs and CNN-DCLMs. As can be seen from these figures, most of the time, the interpretation distance of CNN-DCLMs not participating in the game is greater than CNN-DCLMs, especially at the end of training. The result shows that the game method can effectively improve the interpretable performance of the CNN-DCLMs.

It can be seen from fig. 5 that the dclm of CNN3, CNN5, CNN8 all have stable entropy values at the end of the game. On the Mnist data set, the entropy converges to 120-145 finally. The entropy values eventually converge between 125 and 136 on the FashinonMnist dataset. On the Emnist dataset, the entropy eventually converges between 100 and 120. The result shows that the game algorithm can ensure that the DCLM converges to a stable state

FIG. 6 shows the accuracy of epochs CNN and CNN-DCLMs during a game; from these plots, it can be seen that the accuracy of CNNs and CNN-DCLMs steadily increased in the early stages, while the accuracy of CNN-DCLMs was lower than CNNs. But in the last stage its accuracy tends to be stable and consistent. The main reason is that at an early stage, CNN-DCLMs must reduce their generalization performance and increase their interpretability performance in terms of the tradeoff between generalization performance and interpretability performance. The proposed gaming method can effectively narrow the gap between the two performances. This indicates that the method is effective for the trade-off problem

From the subgraphs of fig. 3 and 4, it can also be seen that one DCLM model is available on each epoch. After 15 epochs, their interpretation distance tends to converge. This phenomenon shows that the gaming process can reduce the distance, and the training process of the recognition part of the network neural network can be interpreted in real time by a dclm model after 15 epochs thereof.

Claims

1. A self-interpretation method for judging partial black box problems of a convolutional neural network is characterized by comprising the following steps:

step 6, constructing a DCLM model of three layers of nodes;

2. The self-interpretation method for black box problem of the discriminant part of the convolutional neural network as claimed in claim 1, wherein in the step 1, it is assumed that under the same input data set, the similarity is measured by the variance of the difference between the outputs of the original CNN network model and the constructed DCLM model, and the measurement result is primarily named as interpretation distance, the smaller the interpretation distance, the closer the shape of the discriminant part of the CNN network is to the shape of the optimal interpretable model, so the better the interpretable performance of the interpretation part is;

φ_d(P*，f)＝∫_z(f(x)-P*(x)-μ^P*(f))²dv (9)

wherein mu^P*(f)＝∫_z(f(x)-P*(x))dv (10)。

3. The self-interpretation method for the convolutional neural network to discriminate the partial black box problem as claimed in claim 2, wherein the step 6 is as follows:

4. The self-interpretation method for the black box problem of the discriminant part of the convolutional neural network as claimed in claim 3, wherein the decision result predicate layer in step 6 is the bottom layer, and each result predicate node has a result predicate function which indicates whether the output neuron of the discriminant part is greater than 0; the corresponding function is expressed as follows:

5. the self-interpretation method for the convolutional neural network to distinguish the partial black box problem as claimed in claim 4, wherein all the nodes of the feature predicate layer and the result predicate layer in the step 6 are connected with one or more extraction nodes of the middle layer, and the layer is called an extraction layer and represents the true or false true value condition; each disjunct represents a disjunct relationship of all the characteristic predicate layer nodes and the result predicate layer nodes, and the disjunct relationship is characterized by a disjunct normal form; if a predicate layer node is connected with a disjunct layer node by a boundary of true or false, and the predicate function of the predicate layer node is followed by a non-operation in the disjunct operation, a potential function of a disjunct normal form is obtained by a Lukasiewicz method as follows:

φ_c(y)＝min(1，T(Γ，y)) (3)

in the formula:

6. The method as claimed in claim 5, wherein the step 6 of setting the DCLM model comprises an extraction relationship:

is equal to

wherein, a₁＝a₄1 and a₂＝a₃＝0；

The group truth conditional probability distribution of the DCLM is as follows:

y_dclm＝(y_dclm，1，y_dclm，2，...，y_dclm，G) (7)

7. The self-interpretation method for the convolutional neural network discriminant partial black box problem as claimed in claim 6, wherein the step 7 game iterative optimization idea is as follows:

inputting: inputting a data set and a target output;

8. The self-interpretation method for the convolutional neural network to discriminate the partial black box problem as claimed in claim 7, wherein the step 7 is as follows:

since this sufficient requirement is difficult to satisfy, the trade-off problem always exists, and its optimal interpretable model P and its optimized discriminant are unknown, therefore, when considering the trade-off problem, extracting one interpretable model from the discriminant in the training process, and then iteratively reducing the interpretation distance between the two models may be a feasible solution, as follows:

wherein the content of the first and second substances,

p(y_t|w，X)＝∫p(y_t|f，w，X)∫p(f|y_dclm，w，X)p(y_dclm|w，X)dy_dclmdf (13)

when the DCLM is known as such,y*_dclmis an optimal solution thereof, and

p(y*_dclm|w，X)＝1 (14)

based on this, we obtain:

∫p(f|y_dclm，w，X)p(y_dclm|w，X)dy_dclm＝p(f|y*_dclm，w，X) (15)

p(y_t|w，X)＝p(y_t|f_nn，w，X)p(f_nn|y*_dclm，w，X) (16)

The conditional probability distribution function is then:

simultaneously:

where α (alpha) is an element parameter determined by the variance of a selected gaussian distribution, this equation can be converted to a minimization problem: