CN113221970A

CN113221970A - Deep convolutional neural network-based improved multi-label semantic segmentation method

Info

Publication number: CN113221970A
Application number: CN202110448155.1A
Authority: CN
Inventors: 吴兴隆
Original assignee: Wuhan Institute of Technology
Current assignee: Wuhan Institute of Technology
Priority date: 2021-04-25
Filing date: 2021-04-25
Publication date: 2021-08-06

Abstract

The invention provides a deep convolutional neural network-based improved multi-label semantic segmentation method, comprising the following steps: guiding the training of the deep convolutional neural network on a training data set with a multi-label loss function; predicting the segmentation results of the samples in the training data set with the trained deep convolutional neural network; updating the sample weights of the training data set based on the prediction results, and updating the training data set by combining the prediction results with the manual labels; and repeating this process until the deep convolutional neural network meets the target performance. Through multiple iterations, the method systematically improves the quality of the manually labeled data and thereby improves the performance of the segmentation task.

Description

Deep convolutional neural network-based improved multi-label semantic segmentation method

Technical Field

The invention belongs to the field of deep learning, and particularly relates to a deep convolutional neural network-based improved multi-label semantic segmentation method.

Background

Artificial intelligence methods based on deep learning (mainly deep convolutional neural networks) have achieved great success in recent years in applications such as image classification, object detection, and semantic segmentation. These approaches typically require a large amount of manually annotated data as guidance. The quantity and correctness of the manual labeling data are therefore of paramount importance, since the labels are used both to guide the iterative training of these methods and to evaluate their performance.

However, gold standard manual labeling data is usually difficult to obtain. For the semantic segmentation task in particular, labeling the contours of the objects to be segmented takes a great deal of time and effort. Even when such manual annotation data is available, further cross-checking readily shows that errors in manual annotation are almost inevitable. Such imperfect manual labeling data seriously affects the accuracy and robustness of deep convolutional neural network-based methods in practical applications. The training of a deep convolutional neural network therefore usually has to rely on a small amount of gold standard data and a large amount of defective labeling data.

In addition, the same image often contains multiple types of objects. A common single-label semantic segmentation model based on a deep convolutional neural network cannot identify multiple types of targets simultaneously, and mutual interference among multiple targets may degrade the segmentation performance.

Therefore, to identify multiple types of targets in an image better and faster, and given that manual labeling data falls short in both quantity and quality, a new method is needed to solve the problem of multi-target semantic segmentation.

Disclosure of Invention

The invention aims to provide a deep convolutional neural network-based improved multi-label semantic segmentation method that solves the problem of multi-target semantic segmentation.

The technical scheme adopted by the invention is as follows:

A deep convolutional neural network-based improved multi-label semantic segmentation method comprises the following steps:

guiding the training of the deep convolutional neural network on a training data set by using a multi-label loss function;

predicting the segmentation result of the samples in the training data set by using the trained deep convolutional neural network;

updating the sample weight of the training data set based on the prediction result, and updating the training data set by combining with manual labeling;

and repeating the process until the deep convolutional neural network meets the target performance.

Preferably, the multi-label loss function is a multi-label loss function based on a Dice-logarithmic function.

Preferably, the multi-label loss function is:

$$Loss_{total} = -\log[DC(Anno_1, Pred_1)] - \log[DC(Anno_2, Pred_2)] - \log[1 - DC(Anno_1, Pred_2)] - \log[1 - DC(Anno_2, Pred_1)]$$

In the formula, Anno represents the manual label, Pred represents the prediction result, the subscript represents the class of the segmentation target, and DC represents Dice's coefficient.

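Preferably, DC is defined as follows:

$$DC(Anno, Pred) = \frac{2\,|Anno \cap Pred|}{|Anno| + |Pred|}$$

where |·| denotes the number of pixels in a region.

Preferably, the performance of the deep convolutional neural network is evaluated using a gold standard data set.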

Preferably, after each training, a performance evaluation of the deep convolutional neural network is performed.

Preferably, the updating of the training data set comprises the steps of:

when the segmentation result is predicted, a probability value matrix of a pixel level is obtained;

performing intersection calculation on the matrix and the manual label to determine background information;

comparing a preset threshold value to obtain missing or incomplete target contour information in the manual labeling;

and supplementing the contour information into the manual labeling to obtain a new training data set.

Preferably, the sample weights of the training data set are updated using an adaptive algorithm.

The invention has the following beneficial effects: the deep convolutional neural network-based improved multi-label semantic segmentation method systematically improves the quality of the manual labeling data through repeated iterative computation, and thereby improves the performance of the segmentation task.

Furthermore, because the learning of the deep convolutional neural network on the training data set is guided by the multi-label loss function based on the Dice-logarithmic function, the interference between the segmentation of multiple targets is reduced as much as possible and the network training process is accelerated.

Drawings

FIG. 1 is a schematic diagram of the deep convolutional neural network-based improved multi-label semantic segmentation method.

Detailed Description

The invention is further described below with reference to the accompanying drawings.

The invention provides a deep convolutional neural network-based improved multi-label semantic segmentation method, and relates to machine vision, convolutional neural networks, deep learning and artificial intelligence technologies. The method guides the learning of the deep convolutional neural network on a training data set through a multi-label loss function based on a Dice-logarithmic function, which reduces the interference between the segmentation of multiple targets as much as possible and accelerates the network training process. Then, the trained, well-performing deep convolutional neural network is used to predict and weight the training samples, and its predictions are fused with the manual labels to obtain an updated labeled data set. Finally, the training of the deep convolutional neural network is restarted on the new labeled data set, and the steps of predicting the training samples, updating their weights, fusing the manual labels and updating the labeled data set are repeated until the network reaches the target performance on an independent gold standard data set.

The deep convolutional neural network-based improved multi-label semantic segmentation method provided by the embodiment of the invention comprises the following steps:

S1, guiding the training of the deep convolutional neural network on a training data set by using a multi-label loss function;

S2, predicting the segmentation results of the samples in the training data set by using the trained deep convolutional neural network;

S3, updating the sample weights of the training data set based on the prediction results, and updating the training data set by combining the prediction results with the manual labels;

and S4, repeating the process until the deep convolutional neural network meets the target performance.

In this embodiment, as shown in FIG. 1, the deep convolutional neural network-based improved multi-label semantic segmentation method includes four modules: (1) a deep convolutional neural network for multi-label semantic segmentation; (2) a multi-label loss function based on the Dice-logarithmic function; (3) a data fusion module for combining the manual labels with the prediction results of the deep convolutional neural network; and (4) an adaptive boosting algorithm for updating the weights of the training samples during the iterative network training. The deep convolutional neural network learns the multi-target segmentation task from the training data set and predicts segmentation results on the training data set. In addition, after each iteration, the performance of the network and the effect of boosting are evaluated on a separate gold standard data set.

The multi-label loss function based on the Dice-logarithmic function guides the training of the deep convolutional neural network, so that the mutual interference between the segmentation of multiple targets is minimized and network convergence during training is accelerated. The loss function is defined as follows:

$$Loss_{total} = -\log[DC(Anno_1, Pred_1)] - \log[DC(Anno_2, Pred_2)] - \log[1 - DC(Anno_1, Pred_2)] - \log[1 - DC(Anno_2, Pred_1)]$$

wherein Anno represents the manual labeling result, Pred represents the network prediction result, and subscripts 1 and 2 represent two different segmentation targets. The number of subscripts is determined by the total number of classes of segmentation targets. DC is Dice's coefficient, which is defined as follows:

$$DC(Anno, Pred) = \frac{2\,|Anno \cap Pred|}{|Anno| + |Pred|}$$

where |·| denotes the number of pixels in a region.
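As a non-limiting illustration, the Dice-logarithmic multi-label loss can be sketched in Python with NumPy. The function names, the smoothing constant `EPS`, the clipping before the logarithm, and the extension of the cross terms to every pair of different classes are assumptions made for this sketch; the source states the formula only for two targets and does not specify numerical safeguards.

```python
import numpy as np

EPS = 1e-7  # assumed smoothing constant; not specified in the source


def dice_coefficient(anno, pred, eps=EPS):
    """Soft Dice coefficient 2|A∩B| / (|A| + |B|) for masks or probability maps."""
    anno = np.asarray(anno, dtype=np.float64)
    pred = np.asarray(pred, dtype=np.float64)
    intersection = np.sum(anno * pred)
    return 2.0 * intersection / (np.sum(anno) + np.sum(pred) + eps)


def dice_log_loss(annos, preds, eps=EPS):
    """Sum -log DC(Anno_i, Pred_i) and, for i != j, -log[1 - DC(Anno_i, Pred_j)]."""
    loss = 0.0
    for i, anno in enumerate(annos):
        for j, pred in enumerate(preds):
            # Clip so that neither log(0) nor log(1 - 1) is ever evaluated.
            dc = np.clip(dice_coefficient(anno, pred), eps, 1.0 - eps)
            if i == j:
                loss += -np.log(dc)        # reward overlap with the own class
            else:
                loss += -np.log(1.0 - dc)  # penalize overlap with other classes
    return float(loss)
```

With two classes the double loop yields exactly the four terms of the formula. A prediction that matches both labels gives a loss close to zero, while a prediction that swaps the two targets is penalized by all four terms at once.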

The data fusion module combines the network prediction results with the manual labels so as to correct the small number of errors or omissions in the manual labels. The fusion algorithm adopted by the module comprises the following steps:

(1) obtaining a deep convolutional neural network with good performance on the training set;

(2) performing multi-target segmentation on the training set data with this network to obtain a pixel-level probability value matrix;

(3) performing an intersection calculation between the matrix and the manual labels to determine background information;

(4) obtaining the target contour information that is missing or incomplete in the manual labels by comparison with a preset threshold;

(5) supplementing the contour information into the manual labels to obtain a new training data set.
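Steps (2) to (5) can be sketched for a single target class as follows. This is one possible reading, not the implementation of the embodiment: the function name `fuse_annotation`, the default threshold of 0.9, and the interpretation of background as the pixels marked by neither the annotator nor the network are assumptions, since the source only refers to an intersection calculation and a preset threshold.

```python
import numpy as np


def fuse_annotation(prob, anno, threshold=0.9):
    """Add confidently predicted target pixels that the manual label missed.

    prob: pixel-level probability matrix for one target class, values in [0, 1].
    anno: binary manual annotation for the same class.
    threshold: assumed confidence cut-off (the source only says "preset").
    """
    prob = np.asarray(prob, dtype=np.float64)
    anno = np.asarray(anno).astype(bool)
    confident = prob >= threshold
    # Assumed reading of step (3): pixels marked by neither the annotator
    # nor the network are kept as background.
    background = ~anno & ~confident
    # Step (4): confident predictions outside the manual label are treated
    # as missing or incomplete contour information.
    missing = confident & ~anno
    # Step (5): supplement the manual label; existing labels are never removed.
    fused = anno | missing
    return fused.astype(np.uint8), missing, background
```

In this sketch the fusion only adds pixels to the manual label, so a weak network cannot erase contours that the annotator drew; whether the embodiment also removes labeled pixels is not stated in the source.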

The adaptive boosting algorithm assigns each training sample a weight according to the segmentation performance of the trained deep convolutional neural network on that sample. In subsequent network training, these weights are applied to the loss function (namely, the loss function weight of each sample is updated), so that the network pays more attention to the samples with poor segmentation performance.
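The source does not give the weight update rule. The sketch below assumes an AdaBoost-style exponential update driven by each sample's Dice score; the function names, the `rate` parameter and the normalization to a unit sum are all assumptions made for illustration.

```python
import numpy as np


def update_sample_weights(weights, dice_scores, rate=1.0):
    """Raise the weight of poorly segmented samples, then renormalize.

    weights: current per-sample weights.
    dice_scores: per-sample Dice coefficient in [0, 1] from the trained network.
    rate: assumed boosting strength; not given in the source.
    """
    weights = np.asarray(weights, dtype=np.float64)
    error = 1.0 - np.asarray(dice_scores, dtype=np.float64)
    boosted = weights * np.exp(rate * error)
    return boosted / boosted.sum()


def weighted_loss(per_sample_losses, weights):
    """Weighted sum of per-sample losses used in the next training round."""
    return float(np.dot(weights, per_sample_losses))
```

For example, starting from four equal weights and Dice scores of 0.9, 0.9, 0.9 and 0.3, the fourth sample ends up with the largest weight and therefore contributes most to the weighted loss in the next round.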

After the sample weights of the training data set are initialized, the deep convolutional neural network is trained under the guidance of the multi-label loss function. The manually labeled data set is then fused with the network predictions according to the performance of the network on the training data set, and the sample weights of the training data set are updated at the same time. This process is iterated until the performance of the network on the gold standard data set reaches the preset threshold.
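The overall iteration can be sketched as a plain Python driver loop. All callables passed in (`train_fn`, `predict_fn`, `evaluate_fn`, `fuse_fn`, `reweight_fn`) are hypothetical placeholders for the modules described above, and the `max_rounds` cap is an added safeguard that does not appear in the source.

```python
def boosted_training_loop(train_fn, predict_fn, evaluate_fn, fuse_fn, reweight_fn,
                          annotations, weights, target_score, max_rounds=10):
    """Repeat train -> evaluate -> predict -> reweight -> fuse until the target is met."""
    history = []
    model = None
    for _ in range(max_rounds):
        # S1: train under the multi-label loss with the current sample weights.
        model = train_fn(annotations, weights)
        # Evaluate on the independent gold standard data set after each training.
        score = evaluate_fn(model)
        history.append(score)
        # S4: stop as soon as the target performance is reached.
        if score >= target_score:
            break
        # S2: predict segmentation results on the training samples.
        predictions = predict_fn(model)
        # S3: update sample weights, then fuse predictions into the manual labels.
        weights = reweight_fn(weights, predictions, annotations)
        annotations = fuse_fn(predictions, annotations)
    return model, annotations, weights, history
```

Passing the modules in as functions keeps the sketch independent of any particular network architecture or deep learning framework, neither of which is named in the source.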

It will be understood by those skilled in the art that the foregoing is merely a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included within the scope of the present invention.

Claims

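1. A deep convolutional neural network-based improved multi-label semantic segmentation method, characterized by comprising the following steps:

guiding the training of the deep convolutional neural network on a training data set by using a multi-label loss function;

predicting the segmentation result of the samples in the training data set by using the trained deep convolutional neural network;

updating the sample weight of the training data set based on the prediction result, and updating the training data set by combining with manual labeling;

and repeating the process until the deep convolutional neural network meets the target performance.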

2. The deep convolutional neural network-based improved multi-label semantic segmentation method of claim 1, wherein the multi-label loss function is a Dice-logarithmic function-based multi-label loss function.

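3. The deep convolutional neural network-based improved multi-label semantic segmentation method according to claim 1 or 2, wherein the multi-label loss function is:

$$Loss_{total} = -\log[DC(Anno_1, Pred_1)] - \log[DC(Anno_2, Pred_2)] - \log[1 - DC(Anno_1, Pred_2)] - \log[1 - DC(Anno_2, Pred_1)]$$

in the formula, Anno represents the manual label, Pred represents the prediction result, the subscript represents the class of the segmentation target, and DC represents Dice's coefficient.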

4. The deep convolutional neural network-based improved multi-label semantic segmentation method of claim 3, wherein DC is defined as follows:

$$DC(Anno, Pred) = \frac{2\,|Anno \cap Pred|}{|Anno| + |Pred|}$$

5. The deep convolutional neural network-based improved multi-label semantic segmentation method of claim 1, wherein the performance of the deep convolutional neural network is evaluated by using a gold standard data set.

6. The improved multi-label semantic segmentation method based on the deep convolutional neural network as claimed in claim 1 or 5, wherein after each training, performance evaluation of the deep convolutional neural network is performed.

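7. The deep convolutional neural network-based improved multi-label semantic segmentation method of claim 1, wherein the updating of the training data set comprises the following steps:

when the segmentation result is predicted, a probability value matrix of a pixel level is obtained;

performing intersection calculation on the matrix and the manual label to determine background information;

comparing a preset threshold value to obtain missing or incomplete target contour information in the manual labeling;

and supplementing the contour information into the manual labeling to obtain a new training data set.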

8. The deep convolutional neural network-based improved multi-label semantic segmentation method of claim 1, wherein the sample weights of the training data set are updated by an adaptive algorithm.