CN114492620A

CN114492620A - Credible multi-view classification method based on evidence deep learning

Info

Publication number: CN114492620A
Application number: CN202210080384.7A
Authority: CN
Inventors: 徐偲; 赵京龙; 赵伟; 管子玉; 詹涛
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2022-01-24
Filing date: 2022-01-24
Publication date: 2022-05-13

Abstract

The invention discloses a credible multi-view classification method based on evidence deep learning, which comprises the following steps: s1, sample definition, wherein N samples are set in a data set, and each sample has V visual angles; s2, estimating the classification uncertainty of the single-view data by using the single-view evidence; s3, fusing the multi-view evidences, and propagating the global information to each view by using a degenerated layer, so that each view can learn the evidence based on the global information; and S4, optimizing the target, and optimizing all parameters in the model by using a gradient descent algorithm. The method not only improves the accuracy of prediction, but also uses the degradation layer designed in the invention to mine complementary information between deep and easily ignored visual angles, thereby outputting uncertainty more conforming to human cognition in prediction.

Description

Credible multi-view classification method based on evidence deep learning

Technical Field

The invention relates to the technical field of deep learning, in particular to a credible multi-view classification method based on evidence deep learning.

Background

The multi-view classification refers to that each sample in the data set contains features of a plurality of different views, for example, the features of a plurality of different views are often considered by doctors when cancer diagnosis is performed. The multi-view classification task is realized by utilizing deep learning, so that the defect that deep information is difficult to mine from multi-view data in the traditional machine learning can be overcome. Most of the past research is focused on how to improve the prediction accuracy, but neglects the reliability of the decision. However, in many high-risk applications, not only the result of the decision obtained by the model, but also the confidence level of the decision obtained is required. For example, in medical diagnosis, the confidence level of the decision is crucial, and a decision with unknown confidence level may be unreliable, thereby misleading the doctor to make a wrong diagnosis and delaying the optimal treatment time of the patient.

In recent years, some methods have appeared to output uncertainty of prediction while predicting, but still cannot effectively mine complementary information between multiple views, and thus cannot output reasonable uncertainty. For example. In a certain medical diagnosis event, according to nuclear magnetic resonance imaging, three categories of health, left-side lesion and right-side lesion can be predicted according to a view angle 1, but the prediction precision is not high due to image blurring; according to the clinical test result, the view angle 2 can only predict two categories of health and lesion without distinguishing whether the lesion is on the left or on the right, but the accuracy is high. If, according to common human knowledge, a lesion is predicted at view 2, then a very large complementary information will be given at view 1. In fact, in the past, since it is impossible to distinguish whether a lesion is on the left or right side, that is, complete category information, when processing the view 2, only high uncertainty can be output, and thus deep complementary information of the two views cannot be mined. Ignoring information from any one perspective in a medical diagnosis increases the likelihood of causing a misdiagnosis.

In the multi-view classification problem, under the condition of ensuring that the accuracy is not reduced, it still has great challenges on how to mine complementary information among views and obtain uncertainty according with human cognition. Therefore, how to provide a credible multi-view classification method based on evidence deep learning is a problem that needs to be solved urgently by those skilled in the art.

Disclosure of Invention

The invention aims to provide a credible multi-view classification method based on evidence deep learning, and aims to solve the problems existing in the background technology by providing a deep learning method which can effectively mine complementarity information among views and obtain reasonable uncertainty under the condition of ensuring that the accuracy is not reduced aiming at the multi-view classification problem.

The credible multi-view classification method based on evidence deep learning comprises the following steps:

s1, sample definition, wherein N samples are set in a data set, and each sample has V visual angles;

s2, estimating the classification uncertainty of the single-view data by using the single-view evidence;

s3, fusing the multi-view evidences, and propagating the global information to each view by using a degenerated layer, so that each view can learn the evidence based on the global information;

and S4, optimizing the target, and optimizing all parameters in the model by using a gradient descent algorithm.

Preferably, in step S1:

characterization vector for nth sample vth view:

wherein D is_vIs its dimension;

considering that the data set contains C categories, each category is represented as a one-hot vector as a true label of a sample, and the true label of the nth sample:

y_n∈{0,1}^C；

the goal is to give a sample, predict its label, and give the uncertainty u of the prediction_n∈[0,1]。

Preferably, in step S2:

for the classification problem of C classes, the dirichlet distribution parameter of the nth sample at the v-th view angle is:

its probability density function is then expressed as:

wherein b (x) represents a probability density function of the polynomial distribution;

T_Cis C-dimensional simplex:

determining parameters of a dirichlet distribution using a deep neural network-based learner

Wherein f is^v(x) A deep neural network-based learner is represented,

an evidence vector representing a v-th perspective of the nth sample;

dirichlet distribution is used to describe evidence distribution, let

Then the probability of prediction as class c for the nth sample and the prediction uncertainty are:

wherein the content of the first and second substances,

representing the intensity of the dirichlet distribution.

Preferably, the evidence value

The larger the likelihood that the corresponding category is a true category; uncertainty when the sum of the evidence values of C classes is larger

The smaller will be.

Preferably, in step S3:

the specific visual angle evidence and the fusion evidence satisfy a degradation relation:

wherein, calculating

Meaning that the corresponding elements of two vectors are multiplied, meaning that the data on both sides are not required to be exactly equal but are expected to be as close as possible, d^v∈R^CRepresenting an intensity degradation vector for scaling the intensity, U, between the perspective-specific evidence and the fused evidence^v∈R^C×CRepresenting a category degradation relationship matrix;

the objective function of the degradation layer can be formulated as:

wherein the content of the first and second substances,

the visual angle is a normalization weight coefficient of the v-th visual angle, the larger the weight is, the higher the reliability of the corresponding visual angle is, and when a sample is input into the model, the visual angle with high reliability plays a positive role in optimizing the model parameters; the view angle with low reliability may influence the updating direction of the parameters, so that the generalization capability of the model is poor, and therefore, the influence of the view angle with low reliability on the updating of the model parameters is weakened by adding the weight coefficient.

Preferably, the degradation loss has the meaning of expressing the difference between the fusion evidence and the characteristic perspective evidence.

Preferably, the degenerating layer includes two degenerating stages of intensity degenerating and category degenerating, the intensity degenerating is that the information intensity of the evidence at the specific view angle is inconsistent with that of the fused evidence, and the category degenerating is that the clustering definition at the specific view angle is inconsistent with that of the fused evidence.

Preferably, in step S4:

according to the dirichlet distribution in S2 and the evidence theory:

true label y of desired specimen_nThe corresponding category has more evidence than other categories, and only the log-likelihood function needs to be minimized:

however, it is not preferable that the larger the proof value is, and it is not desirable in the present invention that the proof values of the plurality of categories are scaled with each other. Therefore, the KL divergence is introduced in the present invention as a regularization term:

wherein the content of the first and second substances,

representing the dirichlet distribution parameters after removing the true category evidence; the KL divergence can on the one hand limit the growth of the overall evidence and on the other hand can force the evidence distribution towards a direction of less information entropy, in other words a greater difference in evidence values between different classes.

Giving the overall optimization objective of the model:

at the time of optimization, setting

t represents the number of iterations, it is expected that the model can learn more sufficient evidence at the beginning of training, and then gradually increase the penalty of the regularization term, setting δ to 0.1 represents the coefficient of the degenerated layer.

Preferably, the

The function stimulates the evidence values of the true category towards the total evidence value.

The invention has the beneficial effects that:

the method not only improves the accuracy of prediction, but also uses the degradation layer designed in the invention to mine complementary information between deep and easily ignored visual angles, thereby outputting uncertainty more conforming to human cognition in prediction.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:

FIG. 1 is a flowchart of an overall depth network of a credible multi-view classification method based on evidence deep learning provided by the invention;

FIG. 2 is a visual angle 1 three-classification sample visualization example diagram of the credible multi-visual angle classification method based on evidence deep learning provided by the invention;

FIG. 3 is a visual angle 2 three-classification sample visualization example diagram of the credible multi-visual angle classification method based on evidence deep learning provided by the invention;

FIG. 4 is a bar graph of evidence value data for the TMC model;

fig. 5 is an evidence value data bar graph of the credible multi-view classification method based on evidence deep learning provided by the invention.

Detailed Description

The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic views, which illustrate the basic structure of the present invention only in a schematic manner, and thus show only the constitution related to the present invention, with reference to fig. 1 to 5:

example 1:

the embodiment provides a deep learning method for solving the multi-view classification problem containing uncertainty prediction by considering the consistency and complementarity of multi-view data, and verifies that the method obviously improves the classification accuracy and confidence.

In step S1:

characterization vector for nth sample vth view:

D_vis its dimension;

y_n∈{0,1}^C；

In step S2:

its probability density function is expressed as:

T_Cis C-dimensional simplex:

Wherein f is^v(x) A deep neural network-based learner is represented,

an evidence vector representing a v-th perspective of the nth sample;

dirichlet distribution is used to describe evidence distribution, let

wherein the content of the first and second substances,

representing the intensity of the dirichlet distribution.

Preferably, the evidence value

The smaller will be.

The model provided by the invention comprises two degradation stages of strength degradation and category degradation, wherein the strength degradation indicates that the information strength of the evidence at the specific visual angle is inconsistent with the information strength of the fused evidence, and the category degradation indicates that the clustering definition of the specific visual angle is inconsistent with the definition of the fused evidence.

Referring to fig. 2-3, in an embodiment, for a three-classification task involving two views, if category 2 and category 3 in view 1 do not have sufficient features to distinguish them, but are able to clearly distinguish category 1 from categories 2 and 3, then view 1 should have useful information, i.e., low uncertainty, to assist in a more clear classification of view 2.

In step S3:

wherein, calculating

taking the sample distribution of view 1 as an example, the expected model can learn the matrix U¹：

Since in View 1, Category 1 is clearly divisible, then

While classes 2 and 3 are completely indistinguishable, therefore class 2 or 3 fused evidence must be spread evenly among the evidence of classes 2 and 3 for perspective 1, and so

The objective function of the degradation layer can be formulated as:

wherein the content of the first and second substances,

the weight coefficient is a normalized weight coefficient of the v-th visual angle, and the higher the weight is, the higher the credibility of the corresponding visual angle is.

When a sample is input into the model, the view angle with high reliability plays a positive role in optimizing the model parameters; the view angle with low reliability may influence the updating direction of the parameters, so that the generalization capability of the model is poor, and therefore, the influence of the view angle with low reliability on the updating of the model parameters is weakened by adding the weight coefficient.

The significance of the degradation loss is to express the difference between the fusion evidence and the characteristic perspective evidence.

In step S4:

according to the dirichlet distribution in S2 and the evidence theory:

however, it is not preferable that the evidence values are larger, and it is not desirable in the present invention that the evidence values of the plurality of categories are mutually scaled. Therefore, the KL divergence is introduced as the regularization term:

wherein the content of the first and second substances,

representing the dirichlet distribution parameters after removing the true category evidence; the KL divergence can limit the growth of the total evidence on one hand and can promote the evidence distribution to trend to a direction with smaller information entropy on the other hand, in other words, the evidence values of different classes are more different;

giving the overall optimization objective of the model:

at the time of optimization, setting

t represents the number of iterations, it is expected that the model can learn more sufficient evidence at the beginning of training, then gradually increasing the penalty of the regularization term, setting δ to 0.1 represents the coefficient of the degenerated layer

After the implementation of the S1-S4, compared with the multi-view classification model provided in the prior art, the method provided by the invention not only can ensure that the accuracy is not lower than that of the conventional method, but also can calculate the uncertainty which is more in line with the practical logic by mining the complementary information among the multi-views.

Referring to fig. 4-5, in order to verify this, the present embodiment designs a test sample, which includes 2 views and 3 categories, the distribution of which is shown in fig. 2, and the best proposed method TMC (the paper "trained Multi-view Classification") and the method proposed in the present invention are used to perform experiments respectively, and the evidence values learned by the recording model are compared.

As can be seen from fig. 4 to 5, at view 1, when the categories 2 and 3 are not easily distinguished but are both clearly distinguishable from the category 1, TMC may have a condition of insufficient evidence mining, that is, information carried by view 1 cannot be fully mined, so that a higher uncertainty of 0.49 is output, which is not ideal, and the method provided by the present invention mines sufficient evidence at both categories 2 and 3 of view 1, which is attributed to mining of information of view 1 by the degraded layer, and further fuses view information, and predicts an accurate and reliable decision uncertainty of 0.22.

The invention has carried on the experiment on two open data sets, hand Written data set is the HandWritten digital picture, the extracted characteristic includes 6 visual angles, divide into 10 categories, Scene15 data set includes 4485 Scene pictures, divide into 15 categories, the invention uses GIST, HOG, LBP method to extract the characteristic of three visual angles to carry on the classification experiment of the multi-view separately, regard accuracy as the index, to each method, each data set carries on 10 times of experiments and gets the mean value, the experimental result is shown as table 1:

TABLE 1

Method	HandWritten	Scene15
			TMC	98.04％	63.85％
The invention	98.75％	66.93％

The embodiment verifies that the method not only improves the accuracy of prediction, but also mines complementary information between deep and easily ignored visual angles by using the degradation layer designed in the invention, thereby outputting uncertainty more conforming to human cognition in prediction.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art should be considered to be within the technical scope of the present invention, and the technical solutions and the inventive concepts thereof according to the present invention should be equivalent or changed within the scope of the present invention.

Claims

1. A credible multi-view classification method based on evidence deep learning is characterized by comprising the following method steps:

2. The credible multi-perspective classification method based on evidence deep learning of claim 1, wherein in step S1:

characterization vector for nth sample vth view:

wherein D is_vIs its dimension;

y_n∈{0,1}^C；

3. The credible multi-perspective classification method based on evidence deep learning of claim 1, wherein in step S2:

its probability density function is then expressed as:

T_Cis C-dimensional simplex:

Wherein f is^v(x) A deep neural network-based learner is represented,

an evidence vector representing the nth sample vth view;

dirichlet distribution is used to describe evidence distribution, let

wherein the content of the first and second substances,

representing the intensity of the dirichlet distribution.

4. The method of credible multi-view classification based on evidence deep learning of claim 3, characterized in that the evidence value

The smaller will be.

5. The credible multi-perspective classification method based on evidence deep learning of claim 1, wherein in step S3:

where the term "operation" denotes the multiplication of corresponding elements of two vectors, the meaning of ≈ is that it does not require exact equality of the data on both sides but it is desirable that they are as close as possible, d^v∈R^CRepresenting an intensity degradation vector for scaling the intensity, U, between the perspective-specific evidence and the fused evidence^v∈R^C×CRepresenting a category degradation relationship matrix;

the objective function of the degradation layer can be formulated as:

wherein, the first and the second end of the pipe are connected with each other,

the weighting coefficient is normalized at the v-th visual angle, and the greater the weighting, the higher the credibility of the corresponding visual angle is.

6. The credible multi-perspective classification method based on evidence deep learning according to claim 5, wherein the meaning of the degradation loss is to express the difference between the fusion evidence and the characteristic perspective evidence.

7. The credible multi-view classification method based on evidence deep learning of claim 5, wherein the degenerating layer comprises two degenerating stages of intensity degeneration and category degeneration, wherein the intensity degeneration is that the information intensity of evidence at a specific view is inconsistent with that of fused evidence, and the category degeneration is that the clustering definition at a specific view is inconsistent with that of fused evidence.

8. The credible multi-perspective classification method based on evidence deep learning of claim 1, wherein in step S4:

according to the dirichlet distribution in S2 and the evidence theory:

introducing a KL divergence as a regularization term:

wherein the content of the first and second substances,

expressing Dirichlet distribution parameters after removing the real category evidence, the KL divergence can limit the increase of the total evidence on one hand, and can promote the evidence distribution to trend to the direction with smaller information entropy on the other hand;

giving the overall optimization objective of the model:

at the time of optimization, setting

9. The evidence deep learning-based credible multi-view classification method according to claim 1, characterized in that, the method is used for credible multi-view classification based on evidence deep learning