WO2020119624A1 - Class-sensitive edge detection method based on deep learning - Google Patents

Class-sensitive edge detection method based on deep learning

Info

Publication number
WO2020119624A1
WO2020119624A1 (PCT/CN2019/123968)
Authority
WO
WIPO (PCT)
Prior art keywords
deep learning
class
feature
edge
edge detection
Prior art date
Application number
PCT/CN2019/123968
Other languages
French (fr)
Chinese (zh)
Inventor
王磊 (Wang Lei)
徐成俊 (Xu Chengjun)
程俊 (Cheng Jun)
Original Assignee
中国科学院深圳先进技术研究院 (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)
Priority date
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)
Publication of WO2020119624A1 publication Critical patent/WO2020119624A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection


Abstract

A class-sensitive edge detection method based on deep learning. A deep learning neural network model is used to obtain edge classification detection results for several classes of targets. A deep supervision method is used for model training, with adaptive scale transformation labels used during training, and a re-weighted loss function and an ordinary cross-entropy loss function used in alternation. With the present method, specific target edges can be detected and the obtained edges classified at the same time. Compared with existing deep learning edge detection methods, the present invention not only improves detection accuracy but also obtains more refined image edges while requiring almost no post-processing. In addition, its functionality is extended, providing a better performance guarantee for other edge-based tasks such as target segmentation and instance segmentation.

Description

A class-sensitive edge detection method based on deep learning

Technical field
The invention relates to the field of image processing, and in particular to image edge detection technology.
Background art
Edge detection is a fundamental problem in image processing. Research on edge segmentation has long been a basic task of computer vision and the technical foundation of many studies and applications; for example, cell segmentation in medical images and road segmentation in autonomous driving are specific applications of edge detection technology.
There are three traditional edge detection algorithms: the Sobel operator, the Laplacian operator, and the Canny operator. Each of these methods has its own applicable situations, and both the efficiency and the robustness with which they obtain edge images are limited. With the rapid development of artificial intelligence, techniques using deep learning to obtain image edges have gradually been proposed, such as references [1] and [2]. Although these techniques bring considerable improvements in efficiency and robustness, the edges they obtain are thick relative to the true edges, and detection accuracy still needs to be improved. Moreover, these edge detection methods can only obtain binary edge images; they cannot classify the obtained edges to further derive the semantic information of the edges.
References:

[1] S. Xie and Z. Tu. Holistically-nested edge detection. In IJCV. Springer, 2017.

[2] G. Bertasius, J. Shi, and L. Torresani. DeepEdge: A multiscale bifurcated deep network for top-down contour detection. In IEEE CVPR, pages 4380-4389, 2015.
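For context, the three classical operators named above are available in standard image libraries; the following minimal OpenCV sketch illustrates them (the file path and Canny thresholds are illustrative choices, not values from the patent):

```python
import cv2

# Read a grayscale test image ("input.png" is a hypothetical path).
img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

sobel_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)  # first-derivative edges, x direction
laplacian = cv2.Laplacian(img, cv2.CV_64F)           # second-derivative (zero-crossing) edges
canny = cv2.Canny(img, 100, 200)                     # binary edge map via hysteresis thresholding
```

Each operator yields a gradient map or a binary edge map; none attaches category information to the edges, which is the gap the method described below addresses.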
Summary of the invention
In view of the above defects of the prior art, the present invention provides an edge detection method that has high detection accuracy and can obtain the category information of edges.
A class-sensitive edge detection method based on deep learning uses a deep learning neural network model to obtain edge classification detection results for several classes of targets.
Further, the image to be detected is input into the deep learning neural network model. The deep learning neural network model is a CNN convolutional neural network model, which includes a feature extraction step 1, an upsampling step 2, and a feature fusion and classification step 3.
Further, the feature extraction step 1 includes five stages of feature extraction, S1-S5, which obtain binary-classification edge feature maps at different scales.
Further, when training the CNN convolutional neural network model, an adaptive scale transformation label is used to supervise the training.
Further, the upsampling step 2 upsamples the edge features extracted in the feature extraction step 1, and the upsampled image size is consistent with the size of the image to be detected. The feature fusion and classification step 3 performs feature fusion based on the result of the upsampling step 2 and classifies the edge categories at the same time.
Further, the feature fusion and classification are performed as follows:

E = {E_1, E_2, E_3},
F = {F_1, F_2, ..., F_n},
F_new = {F_1, E, F_2, E, ..., F_n, E},

where E denotes the set of edge feature maps obtained by supervised learning in the feature extraction step 1; E_i (i = 1, 2, 3) denotes the upsampled version of the edge features obtained in the i-th stage; F denotes the feature obtained in the sixth stage S6; F_n indicates that F has n channels in total, n being the number of target categories to be identified; and F_new is the new feature obtained after feature fusion. The sixth-stage feature S6 is the upsampling of the fifth-stage feature S5.
Further, the loss functions of the CNN convolutional neural network model are used as follows: a re-weighted loss function is used in the supervision of the binary images in stages S1-S5; an ordinary cross-entropy loss function is used in the multi-class edge supervision of stage S6; and a re-weighted loss function is used in the final feature fusion and classification.
Further, the different categories of edges obtained are represented in different colors.
The beneficial effects of the present invention are as follows: the present invention can detect specific target edges and classify the obtained edges at the same time. Compared with existing deep learning edge detection methods, it not only improves detection accuracy and obtains more refined image edges with almost no post-processing required, but also extends functionality, providing a higher performance guarantee for other edge-based tasks such as target segmentation and instance segmentation.
Brief description of the drawings
Figure 1 is an algorithm block diagram of the present invention.

Figure 2 is a comparison of detection results obtained using different loss functions.
Detailed description
The algorithm block diagram of the class-sensitive edge detection method based on deep learning is shown in Figure 1. First, the image to be detected is input into a CNN convolutional neural network model, which includes feature extraction step 1, upsampling step 2, and feature fusion and classification step 3. Feature extraction step 1 includes five stages, S1 to S5, which produce binary-classification edge feature maps at different scales, i.e., S1-edge, S2-edge, etc. in Figure 1. The feature size is halved at each step from S1 to S4, and S5 is the same size as S4.
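The patent does not name a concrete backbone; the following PyTorch sketch shows one plausible reading of the five-stage layout, assuming VGG-style convolution blocks, with the resolution halved between successive stages from S1 to S4 and S5 kept at S4's size. Channel widths, pooling, and the 1x1 side convolutions are illustrative assumptions:

```python
import torch.nn as nn
import torch.nn.functional as F

def conv_block(cin, cout):
    # Two 3x3 convolutions per stage, VGG-style (backbone choice is an assumption).
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class FiveStageExtractor(nn.Module):
    # Feature extraction step 1: stages S1-S5, each with a 1x1 side convolution
    # producing a binary edge map (S1-edge ... S5-edge).
    def __init__(self):
        super().__init__()
        chans = [3, 64, 128, 256, 512, 512]  # channel widths are assumptions
        self.stages = nn.ModuleList(
            [conv_block(chans[i], chans[i + 1]) for i in range(5)])
        self.side = nn.ModuleList([nn.Conv2d(c, 1, 1) for c in chans[1:]])

    def forward(self, x):
        feats, edges = [], []
        for i, stage in enumerate(self.stages):
            if 1 <= i <= 3:              # resolution halves at each step from S1 to S4
                x = F.max_pool2d(x, 2)
            x = stage(x)                 # no pooling before S5, so S5 matches S4's size
            feats.append(x)
            edges.append(self.side[i](x))
        return feats, edges
```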
Upsampling step 2 includes the S6 stage and produces S1-edge-upsample, S2-edge-upsample, and S3-edge-upsample. S6 is the upsampling of the S5 features, and the upsampled size is consistent with the image to be detected. S1-edge-upsample, S2-edge-upsample, and S3-edge-upsample are the upsampled versions of S1-edge, S2-edge, and S3-edge respectively, and their sizes are also consistent with the image to be detected.
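A sketch of upsampling step 2 under the same assumptions; bilinear interpolation is an assumed choice, since the patent only requires the upsampled size to match the input image:

```python
import torch.nn as nn
import torch.nn.functional as F

class UpsampleStep(nn.Module):
    def __init__(self, s5_channels=512, n_classes=20):
        super().__init__()
        # S6: the upsampled S5 feature mapped to n class channels
        # (a 1x1 convolution is assumed for this mapping).
        self.s6_conv = nn.Conv2d(s5_channels, n_classes, 1)

    def forward(self, feats, edges, out_size):
        # S6: upsample the S5 feature to the input-image size, then classify.
        s5_up = F.interpolate(feats[4], size=out_size, mode="bilinear", align_corners=False)
        s6 = self.s6_conv(s5_up)
        # S1-edge-upsample ... S3-edge-upsample: side edge maps at input size.
        edge_ups = [F.interpolate(e, size=out_size, mode="bilinear", align_corners=False)
                    for e in edges[:3]]
        return s6, edge_ups
```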
Feature fusion and classification step 3 fuses all the upsampled features and produces the final multi-class target edge detection result. Feature fusion and classification proceed as follows:
E = {E_1, E_2, E_3},
F = {F_1, F_2, ..., F_n},
F_new = {F_1, E, F_2, E, ..., F_n, E},

where E denotes the set of edge feature maps learned under supervision in the intermediate stages, and E_i (i = 1, 2, 3) denotes the upsampled version of the edge features obtained in the i-th stage. F is the feature obtained in the sixth stage; F_n indicates that F has n channels in total, n being the number of target categories to be identified. F_new is the new feature obtained after feature fusion, and this fused feature F_new is used for the final multi-class classification of edges.
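Read literally, the fusion interleaves the whole edge set E after every class channel of F; a minimal sketch of that reading, assuming channel-wise concatenation:

```python
import torch

def fuse_features(f, edge_ups):
    # f: (B, n, H, W) class features from S6.
    # edge_ups: list of three (B, 1, H, W) upsampled edge maps E_1, E_2, E_3.
    e = torch.cat(edge_ups, dim=1)       # E = {E_1, E_2, E_3}: (B, 3, H, W)
    pieces = []
    for k in range(f.shape[1]):          # F_new = {F_1, E, F_2, E, ..., F_n, E}
        pieces.append(f[:, k:k + 1])
        pieces.append(e)
    return torch.cat(pieces, dim=1)      # (B, 4n, H, W)
```

A final classifier (e.g., a 1x1 convolution back to n channels) would consume F_new to produce the multi-class edge map.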
To better achieve multi-classification of edge pixels, supervised learning is applied to the CNN convolutional neural network. In stages S1-S5, supervised training is performed on multi-scale binary edge images, i.e., only binary edge classification is performed; at S6, multi-classification of edge pixels is then performed. The classification output of stage S6 is used for preliminary supervised learning of the model, and its number of output channels is consistent with the number of categories to be detected. During supervised learning, the feature scales obtained at each stage of feature extraction differ (e.g., S1-edge, S2-edge, and S3-edge in Figure 1), while only one training label is provided; therefore, an adaptive scale transformation label is proposed for supervised training of the intermediate stages, as sketched below. During stage-wise supervised training, the label is adaptively resized to the size of the corresponding stage's feature map according to the current feature map size, and the loss function is then calculated.
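A minimal sketch of the adaptive scale transformation label; nearest-neighbor resizing is an assumption chosen to keep the binary edge label binary:

```python
import torch.nn.functional as F

def stage_label(label, side_logit):
    # label: (B, 1, H, W) ground-truth binary edge map (one label per image).
    # Resize it to the current stage's feature-map size before computing the loss.
    return F.interpolate(label, size=side_logit.shape[-2:], mode="nearest")
```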
Regarding the loss function: existing algorithms that use deep learning for edge detection employ only a re-weighted loss function, which makes the predicted edges relatively thick and requires a later refinement step. In the present invention, we alternate between the re-weighted loss function and the ordinary cross-entropy loss function as the loss functions of the whole neural network. Specifically, a re-weighted loss function is used in the supervision of the binary images in stages S1-S5; the ordinary cross-entropy loss function is used in the multi-class edge supervision (Classification) of stage S6; and a re-weighted loss function is used in the final feature fusion and classification. Repeated experiments show that alternating between the re-weighted loss function and the ordinary cross-entropy loss function both improves detection accuracy and refines the edges.
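A sketch of the alternating loss scheme under the preceding assumptions. The patent does not spell out its re-weighting formula; a HED-style class-balanced binary cross-entropy is used as a stand-in for the S1-S5 terms, and inverse-frequency class weights as a stand-in for the final re-weighted multi-class term:

```python
import torch
import torch.nn.functional as F

def reweighted_bce(logit, label):
    # Edge pixels are rare, so each pixel is weighted by the frequency of the
    # opposite class (HED-style balancing; an assumption).
    pos = label.sum()
    total = float(label.numel())
    weight = torch.where(label > 0, (total - pos) / total, pos / total)
    return F.binary_cross_entropy_with_logits(logit, label, weight=weight)

def total_loss(side_logits, binary_label, s6_logit, fused_logit, class_label):
    # S1-S5: re-weighted binary supervision, labels resized per stage.
    loss = 0.0
    for s in side_logits:
        lbl = F.interpolate(binary_label, size=s.shape[-2:], mode="nearest")
        loss = loss + reweighted_bce(s, lbl)
    # S6: ordinary (unweighted) cross-entropy over the n edge classes.
    loss = loss + F.cross_entropy(s6_logit, class_label)
    # Final fusion: re-weighted again; inverse-frequency weights are assumed.
    n = s6_logit.shape[1]
    counts = torch.bincount(class_label.flatten(), minlength=n).float()
    loss = loss + F.cross_entropy(fused_logit, class_label, weight=1.0 / (counts + 1.0))
    return loss
```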
The present invention uses two datasets for training and testing. One is the SBD (Semantic Boundary Dataset), which contains 11,355 images, of which 8,498 are used for training and 2,857 for validation. The second is the Cityscapes dataset, which contains 5,000 images, of which 2,975 are used for training, 500 for validation, and 1,525 for testing.
The input image size is 400x400, and the evaluation metric used is the F-measure, computed as:

F-measure = 2 * Precision * Recall / (Precision + Recall),

where

Precision = TP / (TP + FP), Recall = TP / (TP + FN).

Here TP (true positives) is the number of pixels predicted as edge that are edge pixels in the ground truth, FP (false positives) is the number of pixels predicted as edge that are not edge pixels in the ground truth, and FN (false negatives) is the number of ground-truth edge pixels that are not predicted as edge.
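A sketch of the metric; this is the strict pixel-wise form of the F-measure, whereas benchmark protocols usually allow a small localization tolerance when matching predicted and ground-truth edge pixels:

```python
import numpy as np

def f_measure(pred, gt):
    # pred, gt: binary edge maps (0/1 numpy arrays) of equal shape.
    tp = np.sum((pred == 1) & (gt == 1))   # edge pixels correctly detected
    fp = np.sum((pred == 1) & (gt == 0))   # spurious edge predictions
    fn = np.sum((pred == 0) & (gt == 1))   # missed edge pixels
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```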
The validation metrics for the 20 classes on the SBD dataset are as follows:

[Table: per-class validation scores on the SBD dataset; rendered as an image in the original and not recoverable here.]
The effect of the present invention is shown in Figure 2, where (a) and (d) are the original images, (b) and (e) are the detection results obtained using only the re-weighted loss function, and (c) and (f) are the results obtained using the method of the present invention. In Figure 2(c), different category regions are marked with different colors: the chairs and the sofa are outlined in red and green respectively, i.e., the dark and light boxes in Figure 2(c).
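Rendering the per-class edges amounts to indexing a color palette with the class map; a minimal sketch with an illustrative palette:

```python
import numpy as np

# Hypothetical palette: one RGB color per edge class, index 0 = background.
PALETTE = np.array([[0, 0, 0],     # background
                    [255, 0, 0],   # e.g. chair (red, as in Figure 2(c))
                    [0, 255, 0]],  # e.g. sofa (green)
                   dtype=np.uint8)

def colorize(edge_classes):
    # edge_classes: (H, W) integer map of per-pixel edge class indices.
    return PALETTE[edge_classes]   # (H, W, 3) color image
```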
Those of ordinary skill in the art should understand that the foregoing is only a specific embodiment of the present invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

  1. A class-sensitive edge detection method based on deep learning, characterized in that a deep learning neural network model is used to obtain edge classification detection results for several classes of targets.
  2. The class-sensitive edge detection method based on deep learning according to claim 1, characterized in that the image to be detected is input into the deep learning neural network model, the deep learning neural network model being a CNN convolutional neural network model that includes a feature extraction step (1), an upsampling step (2), and a feature fusion and classification step (3).
  3. The class-sensitive edge detection method based on deep learning according to claim 2, characterized in that the feature extraction step (1) includes five stages of feature extraction (S1-S5) to obtain binary-classification edge feature maps at different scales.
  4. The class-sensitive edge detection method based on deep learning according to claim 2 or 3, characterized in that, when training the CNN convolutional neural network model, an adaptive scale transformation label is used to supervise the training.
  5. The class-sensitive edge detection method based on deep learning according to claim 2, characterized in that the upsampling step (2) upsamples the edge features extracted in the feature extraction step (1), the upsampled image size being consistent with the size of the image to be detected; and the feature fusion and classification step (3) performs feature fusion based on the result of the upsampling step (2) while classifying the edge categories.
  6. The class-sensitive edge detection method based on deep learning according to claim 5, characterized in that the feature fusion and classification are performed as follows:

    E = {E_1, E_2, E_3},
    F = {F_1, F_2, ..., F_n},
    F_new = {F_1, E, F_2, E, ..., F_n, E},

    where E denotes the set of edge feature images obtained by supervised learning in the feature extraction step (1); E_i (i = 1, 2, 3) denotes the upsampled version of the edge features obtained in the i-th stage; F denotes the feature obtained in the sixth stage (S6); F_n indicates that F has n channels in total, n being the number of target categories to be identified; and F_new is the new feature obtained after feature fusion, the sixth-stage feature (S6) being the upsampling of the fifth-stage feature (S5).
  7. The class-sensitive edge detection method based on deep learning according to claim 6, characterized in that the loss functions of the CNN convolutional neural network model are used as follows: a re-weighted loss function is used in the supervised learning of the binary images in the feature extraction step (1); an ordinary cross-entropy loss function is used in the multi-class edge supervised learning of the sixth stage (S6); and a re-weighted loss function is used in the feature fusion and classification step (3).
  8. The class-sensitive edge detection method based on deep learning according to any one of claims 1-7, characterized in that the different categories of edges obtained are represented in different colors.
PCT/CN2019/123968 2018-12-12 2019-12-09 Class-sensitive edge detection method based on deep learning WO2020119624A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811520198.0A CN109741351A (en) 2018-12-12 2018-12-12 A kind of classification responsive type edge detection method based on deep learning
CN201811520198.0 2018-12-12

Publications (1)

Publication Number Publication Date
WO2020119624A1 true WO2020119624A1 (en) 2020-06-18

Family

ID=66359395

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/123968 WO2020119624A1 (en) 2018-12-12 2019-12-09 Class-sensitive edge detection method based on deep learning

Country Status (2)

Country Link
CN (1) CN109741351A (en)
WO (1) WO2020119624A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115588024A (en) * 2022-11-25 2023-01-10 东莞市兆丰精密仪器有限公司 Artificial intelligence-based complex industrial image edge extraction method and device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109741351A (en) * 2018-12-12 2019-05-10 中国科学院深圳先进技术研究院 A kind of classification responsive type edge detection method based on deep learning
CN110310254B (en) * 2019-05-17 2022-11-29 广东技术师范大学 Automatic room corner image grading method based on deep learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018023734A1 (en) * 2016-08-05 2018-02-08 深圳大学 Significance testing method for 3d image
CN108710919A (en) * 2018-05-25 2018-10-26 东南大学 A kind of crack automation delineation method based on multi-scale feature fusion deep learning
CN109741351A (en) * 2018-12-12 2019-05-10 中国科学院深圳先进技术研究院 A kind of classification responsive type edge detection method based on deep learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104952073B (en) * 2015-06-15 2017-12-15 上海交通大学 Scene Incision method based on deep learning
CN105069807B (en) * 2015-08-28 2018-03-23 西安工程大学 A kind of stamped workpieces defect inspection method based on image procossing
CN107610140A (en) * 2017-08-07 2018-01-19 中国科学院自动化研究所 Near edge detection method, device based on depth integration corrective networks

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018023734A1 (en) * 2016-08-05 2018-02-08 深圳大学 Significance testing method for 3d image
CN108710919A (en) * 2018-05-25 2018-10-26 东南大学 A kind of crack automation delineation method based on multi-scale feature fusion deep learning
CN109741351A (en) * 2018-12-12 2019-05-10 中国科学院深圳先进技术研究院 A kind of classification responsive type edge detection method based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU, ZHIDING ET AL.: "CASENet: Deep Category-Aware Semantic Edge Detection", 2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 31 December 2017 (2017-12-31), XP033249517 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115588024A (en) * 2022-11-25 2023-01-10 东莞市兆丰精密仪器有限公司 Artificial intelligence-based complex industrial image edge extraction method and device

Also Published As

Publication number Publication date
CN109741351A (en) 2019-05-10

Similar Documents

Publication Publication Date Title
CN110543837B (en) Visible light airport airplane detection method based on potential target point
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
US10846566B2 (en) Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
CN110781897B (en) Semantic edge detection method based on deep learning
CN106408030B (en) SAR image classification method based on middle layer semantic attribute and convolutional neural networks
CN104361313B (en) A kind of gesture identification method merged based on Multiple Kernel Learning heterogeneous characteristic
CN105138998B (en) Pedestrian based on the adaptive sub-space learning algorithm in visual angle recognition methods and system again
Zhang et al. Infrared image segmentation for photovoltaic panels based on Res-UNet
WO2020119624A1 (en) Class-sensitive edge detection method based on deep learning
CN109740686A (en) A kind of deep learning image multiple labeling classification method based on pool area and Fusion Features
CN106126585B (en) The unmanned plane image search method combined based on quality grading with perceived hash characteristics
Zang et al. Traffic sign detection based on cascaded convolutional neural networks
CN104036284A (en) Adaboost algorithm based multi-scale pedestrian detection method
CN108734200B (en) Human target visual detection method and device based on BING (building information network) features
CN103106265A (en) Method and system of classifying similar images
CN112926652B (en) Fish fine granularity image recognition method based on deep learning
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
CN103186776A (en) Human detection method based on multiple features and depth information
Intwala et al. Indian sign language converter using convolutional neural networks
CN111026898A (en) Weak supervision image emotion classification and positioning method based on cross space pooling strategy
CN114492634B (en) Fine granularity equipment picture classification and identification method and system
Li et al. HEp-2 specimen classification via deep CNNs and pattern histogram
Khellal et al. Pedestrian classification and detection in far infrared images
CN106203448A (en) A kind of scene classification method based on Nonlinear Scale Space Theory
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19897270

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.11.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19897270

Country of ref document: EP

Kind code of ref document: A1