CN111145178A - High-resolution remote sensing image multi-scale segmentation method

Info

Publication number: CN111145178A
Application number: CN201811310536.8A
Authority: CN
Inventors: 漆进; 张通; 史鹏
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2018-11-06
Filing date: 2018-11-06
Publication date: 2020-05-12

Abstract

The invention belongs to the field of computer vision and remote sensing image processing, and particularly relates to a high-resolution remote sensing image multi-scale segmentation method. The method comprises the following steps: preprocessing the remote sensing image; training a multi-scale segmentation model; and predicting on the large-scale remote sensing image. The method can effectively mitigate the negative effects of the edge effect and inter-class competition, and improve the segmentation accuracy of remote sensing images.

Description

High-resolution remote sensing image multi-scale segmentation method

Technical Field

The invention belongs to the fields of deep learning, computer vision and remote sensing image processing, and particularly relates to a high-resolution remote sensing image multi-scale segmentation method.

Background

Remote sensing image segmentation is an important component of digital image analysis and has long been widely applied in fields such as land monitoring, geographic mapping, urban planning, and disaster prevention and mitigation. With the development of deep learning, segmentation techniques for remote sensing images, such as land-cover classification, have also advanced.

Segmentation is made difficult by certain properties of remote sensing images, such as very large image size and severe class imbalance. When a large high-resolution remote sensing image is segmented, the original image is generally first cut into tiles of moderate size, and the predictions for the tiles are then stitched back to the original size. This often introduces an edge effect: unsmooth boundaries and low accuracy at the seams. In addition, the inter-class competition caused by class imbalance can severely suppress classes that occupy a small share of the image area, which reduces segmentation accuracy. The high-resolution remote sensing image multi-scale segmentation method of the invention can effectively mitigate the edge effect and inter-class competition and improve the segmentation accuracy of remote sensing images.

Disclosure of Invention

To address the above problems and reduce the negative effects of the edge effect and inter-class competition, the invention provides a high-resolution remote sensing image multi-scale segmentation method.

The technical scheme adopted by the invention is as follows:

(1) Crop, normalize and augment the high-resolution remote sensing image.

(2) Construct a neural network and train a multi-scale segmentation model on the remote sensing images processed in step (1).

(3) Crop the high-resolution remote sensing image to be tested and perform multi-scale prediction with the segmentation model to generate the segmentation result.

The cropping, normalization and data augmentation in step (1) specifically comprise:

(11) Randomly crop a moderately sized picture from the high-resolution remote sensing image, and apply data augmentation such as up-down flipping, left-right flipping, right-angle rotation, random contrast and random saturation to the cropped picture to increase the diversity of the training samples. Apply the same processing to the label picture so that the newly generated label picture stays synchronized with the training sample.

(12) Normalize the picture generated in step (11): subtract the corresponding channel mean from each channel of the picture, then divide the result by the corresponding channel standard deviation.
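Steps (11) and (12) can be illustrated with a minimal NumPy sketch. The function names, the (H, W, C) array layout and the 0.5 flip probability are assumptions made for illustration and are not specified in the source; random contrast and random saturation are omitted for brevity.

```python
import numpy as np

def random_crop_pair(image, label, size, rng):
    """Crop the same random size x size window from image (H, W, C) and label (H, W)."""
    h, w = label.shape
    y = int(rng.integers(0, h - size + 1))
    x = int(rng.integers(0, w - size + 1))
    return image[y:y + size, x:x + size], label[y:y + size, x:x + size]

def augment_pair(image, label, rng):
    """Apply identical flips and a right-angle rotation to image and label."""
    if rng.random() < 0.5:  # up-down flip (probability is an assumption)
        image, label = image[::-1], label[::-1]
    if rng.random() < 0.5:  # left-right flip
        image, label = image[:, ::-1], label[:, ::-1]
    k = int(rng.integers(0, 4))  # rotate by k * 90 degrees
    image = np.rot90(image, k, axes=(0, 1))
    label = np.rot90(label, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(label)

def normalize(image, mean, std):
    """Per-channel standardization: (x - channel mean) / channel std."""
    return (image.astype(np.float32) - mean) / std
```

Because the same slices and rotation are applied to both arrays, every label pixel stays aligned with its image pixel after augmentation.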

The multi-scale segmentation model training process in the step (2) specifically comprises the following steps:

(21) The segmentation network adopts an encoder-decoder structure, and ResNet101 with its fully connected layer removed is used as the encoder. In the decoder, the output of the fifth block of ResNet101 is upsampled by a factor of two and added to the output of the fourth block; the sum is upsampled by a factor of two and added to the output of the third block; that sum is in turn upsampled by a factor of two and added to the output of the second block; and the result is finally upsampled to the network input size. Each training sample is passed through the network at its original size, at 0.75 times that size and at 1.25 times that size; the three outputs are rescaled to the original size and then concatenated. Assuming the label picture contains n classes, the network output passes through n convolution layers to give n binary branches, where the ith branch (i = 0, 1, 2, …, n-1) represents the probability that the current pixel belongs to class i.
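The top-down merging in the decoder and the n binary branches can be sketched with plain NumPy arrays. This is only an illustration under several assumptions not stated in the source: the four encoder outputs already share the same channel count, upsampling is nearest-neighbour, each branch is a 1x1 convolution followed by a sigmoid, and the second block sits at 1/4 of the input resolution (as in a standard ResNet). The multi-scale concatenation of the 0.75x, 1x and 1.25x outputs is not shown.

```python
import numpy as np

def upsample(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def top_down_merge(c2, c3, c4, c5):
    """Merge encoder outputs coarse to fine: up(c5)+c4, up(.)+c3, up(.)+c2."""
    p4 = upsample(c5, 2) + c4
    p3 = upsample(p4, 2) + c3
    p2 = upsample(p3, 2) + c2
    return p2

def binary_branches(features, weights, bias):
    """n independent 1x1 convolutions + sigmoid.

    features: (C, H, W); weights: (n, C); bias: (n,).
    Returns (n, H, W), where map i is the probability of class i.
    """
    logits = np.einsum("nc,chw->nhw", weights, features) + bias[:, None, None]
    return 1.0 / (1.0 + np.exp(-logits))

# Example shapes for a 128 x 128 input (strides 4, 8, 16, 32 are assumed).
c2, c3 = np.ones((8, 32, 32)), np.ones((8, 16, 16))
c4, c5 = np.ones((8, 8, 8)), np.ones((8, 4, 4))
merged = upsample(top_down_merge(c2, c3, c4, c5), 4)  # back to input size
```

Using a sigmoid per branch, rather than a single softmax over all classes, is what lets each class be scored independently of the others.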

(22) Training uses stochastic gradient descent with a composite loss function consisting of the cross-entropy and an approximate Jaccard coefficient, calculated as follows:

cross_entropy = -∑ (y_true · log(y_pred) + (1 - y_true) · log(1 - y_pred))

loss = cross_entropy - log(jaccard_approximation)
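The composite loss can be written out for a single binary branch as follows. The source does not define jaccard_approximation; this sketch assumes the common soft (differentiable) Jaccard form, intersection / (sum(y_true) + sum(y_pred) - intersection), and adds a small eps for numerical stability. Both choices are assumptions.

```python
import numpy as np

def composite_loss(y_true, y_pred, eps=1e-7):
    """Cross-entropy minus log of a soft Jaccard coefficient (assumed form)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)

    # Summed binary cross-entropy, matching the formula in step (22).
    cross_entropy = -np.sum(
        y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)
    )

    # Soft Jaccard: products replace set intersection so it stays differentiable.
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    jaccard = (intersection + eps) / (union + eps)

    return cross_entropy - np.log(jaccard)
```

The -log(jaccard) term approaches zero as the predicted mask overlaps the label mask and grows when overlap is poor, so it penalizes a small class being missed even when the pixel-wise cross-entropy is dominated by the large classes.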

The multi-scale prediction process in step (3) specifically includes:

(31) Cut the high-resolution remote sensing image to be tested into three groups of tiles of different sizes. Let the remote sensing image have height h and width w, and let the tile size of the jth group (j = 0, 1, 2) have height h_j and width w_j. First, the remote sensing image is reflection-padded; the padded image has height [expression missing] and width [expression missing].

(32) Take one image A from the jth group (j = 0) of tiles cut from the padded remote sensing image, normalize it as described in (12), and input it into the trained segmentation model to obtain n probability maps as described in (21); concatenate the n probability maps into A'. Flip image A up-down to obtain image B and left-right to obtain image C, and perform the same operation on B and C to obtain probability maps B' and C' respectively. Flip B' up-down and C' left-right to restore their original orientation, then average the two resulting probability maps with A' to obtain the final prediction probability map.

(33) Traverse the remaining images of the jth group (j = 0), performing the operation described in (32) on each, and stitch the final prediction probability maps into a probability map of height [expression missing] and width [expression missing]. Crop this probability map to height h and width w.

(34) Perform operations (32) and (33) on the two groups of images with j = 1 and j = 2, and average the three resulting probability maps; the class with the maximum probability at each pixel is the class of that pixel.
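Steps (31) to (34) can be sketched end to end as follows. The padded-size expressions are missing from the text above, so this sketch assumes the image is reflection-padded at the bottom and right up to the next multiple of the tile size; that rule, the function names, and the `model` callable (assumed to normalize an (H, W, C) tile and return an (n, H, W) probability array) are all hypothetical.

```python
import math
import numpy as np

def predict_with_flips(model, tile):
    """Step (32): average the prediction with flipped-and-restored predictions."""
    a = model(tile)
    b = model(tile[::-1])[:, ::-1, :]        # up-down flip, then flip back
    c = model(tile[:, ::-1])[:, :, ::-1]     # left-right flip, then flip back
    return (a + b + c) / 3.0

def predict_scale(model, image, th, tw):
    """Steps (31) and (33): pad, predict tile by tile, stitch, crop to (h, w)."""
    h, w = image.shape[:2]
    ph = math.ceil(h / th) * th              # assumed padded height
    pw = math.ceil(w / tw) * tw              # assumed padded width
    padded = np.pad(image, ((0, ph - h), (0, pw - w), (0, 0)), mode="reflect")
    out = None
    for y in range(0, ph, th):
        for x in range(0, pw, tw):
            probs = predict_with_flips(model, padded[y:y + th, x:x + tw])
            if out is None:
                out = np.zeros((probs.shape[0], ph, pw), dtype=np.float32)
            out[:, y:y + th, x:x + tw] = probs
    return out[:, :h, :w]

def multi_scale_predict(model, image, tile_sizes):
    """Step (34): average the per-scale maps and take the most probable class."""
    maps = [predict_scale(model, image, th, tw) for th, tw in tile_sizes]
    return np.mean(maps, axis=0).argmax(axis=0)
```

Because each group uses a different tile size, the tile seams fall at different pixel positions in each group, so averaging the three maps smooths out the low-accuracy regions near any single set of seams.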

The invention has the beneficial effects that:

The invention provides a high-resolution remote sensing image multi-scale segmentation method in which a large remote sensing image is cut into several groups of tiles of different sizes, the predictions for each group of tiles are stitched back to the original size, and the results of the groups are then fused, which effectively mitigates the edge effect. In addition, the multi-class classification of each pixel is decomposed into several binary classification problems, which effectively mitigates the inter-class competition caused by class imbalance and improves the segmentation accuracy of remote sensing images.

Drawings

FIG. 1 is a high resolution remote sensing image to be predicted

FIG. 2 is a segmentation result of a high resolution remote sensing image

Detailed Description

The present invention will be described in detail below with reference to the accompanying drawings.

The invention discloses a high-resolution remote sensing image multi-scale segmentation method. Its specific implementation follows steps (1) to (3) as set out in the Disclosure of Invention, using the same composite loss function and the same multi-scale prediction process.

The high-resolution remote sensing image to be predicted is shown in FIG. 1, and its segmentation result is shown in FIG. 2. Experimental results show that the method can effectively mitigate the edge effect and inter-class competition and improve the segmentation accuracy of remote sensing images.

Claims

1. A high-resolution remote sensing image multi-scale segmentation method is characterized by comprising the following steps:

(1) cutting, normalizing and data enhancing the high-resolution remote sensing image;

(2) constructing a neural network, and training a multi-scale segmentation model on the remote sensing image processed in the step (1);

2. The method according to claim 1, wherein the step (1) specifically comprises:

(11) randomly cropping a moderately sized picture from the high-resolution remote sensing image, and applying data augmentation such as up-down flipping, left-right flipping, right-angle rotation, random contrast and random saturation to the cropped picture to increase the diversity of the training samples; applying the same processing to the label picture so that the newly generated label picture stays synchronized with the training sample;

3. The method according to claim 1, wherein the step (2) specifically comprises:

(21) the segmentation network adopts an encoder-decoder structure, and ResNet101 with its fully connected layer removed is used as the encoder; in the decoder, the output of the fifth block of ResNet101 is upsampled by a factor of two and added to the output of the fourth block, the sum is upsampled by a factor of two and added to the output of the third block, that sum is in turn upsampled by a factor of two and added to the output of the second block, and the result is finally upsampled to the network input size; each training sample is passed through the network at its original size, at 0.75 times that size and at 1.25 times that size, the three outputs are rescaled to the original size, and the outputs of the three sizes are concatenated; the label picture is assumed to contain n classes, the network output passes through n convolution layers to give n binary branches, and the ith branch (i = 0, 1, 2, …, n-1) represents the probability that the current pixel belongs to class i,

cross_entropy = -∑ (y_true · log(y_pred) + (1 - y_true) · log(1 - y_pred))

loss = cross_entropy - log(jaccard_approximation).

4. The method according to claim 1, wherein the step (3) specifically comprises:

(32) taking one image A from the jth group (j = 0) of tiles cut from the padded remote sensing image, normalizing it as described in (12), inputting it into the trained segmentation model to obtain n probability maps as described in (21), and concatenating the n probability maps into A'; flipping image A up-down to obtain image B and left-right to obtain image C, and performing the same operation on B and C to obtain probability maps B' and C' respectively; flipping B' up-down and C' left-right to restore their original orientation, and averaging the two resulting probability maps with A' to obtain the final prediction probability map;

cropping the probability map to height h and width w;