CN110414494A - SAR image classification method with ASPP deconvolution network - Google Patents

SAR image classification method with ASPP deconvolution network

Info

Publication number
CN110414494A
CN110414494A
Authority
CN
China
Prior art keywords
image
aspp
training
test
pixels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910829626.6A
Other languages
Chinese (zh)
Other versions
CN110414494B (en)
Inventor
王英华
刘睿
刘宏伟
张磊
王聪
贾少鹏
秦庆喜
Current Assignee
Xian University of Electronic Science and Technology
Original Assignee
Xian University of Electronic Science and Technology
Priority date
Filing date
Publication date
Application filed by Xian University of Electronic Science and Technology
Publication of CN110414494A
Application granted
Publication of CN110414494B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an SAR image classification method with an ASPP deconvolution network, mainly solving the problems of low classification accuracy, low classification efficiency, and complicated operation in existing SAR image classification. The specific steps of the present invention are as follows: (1) read a training image and a test image; (2) generate a training set and test sets; (3) build a deconvolution network with atrous spatial pyramid pooling (ASPP); (4) train the deconvolution network with ASPP; (5) classify the test image. The present invention overcomes the failure of existing SAR image classification algorithms to fully use multi-scale target information and their inability to classify end to end, and has the advantages of high classification accuracy, fast classification, and simple operation.

Description

SAR image classification method with ASPP deconvolution network
Technical field
The invention belongs to the technical field of image processing, and further relates to a synthetic aperture radar SAR (Synthetic Aperture Radar) image classification method in the field of image classification technology, based on a deconvolution network with atrous spatial pyramid pooling ASPP (Atrous Spatial Pyramid Pooling). The present invention can be used to classify the types of ground-object targets in synthetic aperture radar SAR images, such as building areas, farmland regions, and airport runway regions.
Background technique
Because synthetic aperture radar SAR works in all weather conditions, around the clock, and with strong penetration, SAR imaging technology has been widely studied. As SAR imaging technology has matured, the resolution of the acquired images has become higher and higher. SAR image classification is an important branch of the radar image processing field. It classifies the different ground-object targets in a SAR image, such as building areas, farmland regions, and airport runway regions, by exploiting their different textural characteristics. The selected features determine the quality of the classification results in SAR image classification; common textural features include the mean, variance, entropy, energy, and gray-level co-occurrence matrix. Deep learning methods can extract effective features automatically, solving the difficulty of hand-selecting features in conventional methods, and are now widely applied to SAR image classification. Common deep learning classification methods include methods based on convolutional neural networks, methods based on deep belief networks, and methods based on deconvolutional neural networks.
Xian University of Electronic Science and Technology, in its patent application "SAR image classification method based on super-pixel segmentation and convolution-deconvolution network" (application number 201810512092.X, publication number CN 108764330A), discloses a SAR image classification method based on super-pixel segmentation and a convolution-deconvolution network. The method first feeds SAR image data into a convolution-deconvolution network to obtain a preliminary classification of the image, then combines the super-pixel segmentation result of the SAR image data with the preliminary classification result to obtain the final classification result. By using super-pixel segmentation to smooth the preliminary classification, the method overcomes the time-consuming classification of deep-learning-based SAR image classification methods and improves the classification precision of SAR images. Its remaining shortcoming is that feeding the SAR image data into the convolution-deconvolution network does not fully exploit the multi-scale target information of the SAR image, so SAR image classification accuracy is still not high enough.
Hefei University of Technology, in its patent application "SAR image classification method based on textural features and DBN" (application number 201710652513.4, publication number CN 107506699A), discloses a SAR image classification method based on textural features and a deep belief network DBN. The method feeds the GLCM features of gray-level image blocks, the GMRF features of raw-data image blocks, and the intensity information of the image into a deep belief network DBN to obtain the classification results of the image. By introducing textural features of the image to assist DBN classification, the method overcomes the limitation of deep-learning-based SAR image classification methods that use only image intensity information, and improves the classification precision of SAR images. Its remaining shortcoming is that using textural features to assist DBN classification does not directly yield the classification result of the SAR image to be classified, so end-to-end classification cannot be achieved; operation is complicated, and each test obtains the classification result of only one pixel, which is unfavorable for engineering implementation and makes SAR image classification inefficient.
Summary of the invention
The object of the invention is, in view of the above shortcomings of the prior art, to propose a SAR image classification method with an ASPP deconvolution network, mainly solving the problems of insufficient classification accuracy, complicated operation, and low classification efficiency in SAR image classification.
The idea for realizing the object of the invention is as follows: read a training image and a test image; extract a training dataset from the training image with a sliding window and perform data augmentation to generate the training set; slide the window from different starting positions of the test image to generate the test sets; build a deconvolution network with atrous spatial pyramid pooling ASPP; feed the training set into the deconvolution network and train until the network parameters converge, obtaining a trained deconvolution network; feed each test set with a different starting position into the trained deconvolution network and classify to obtain the final classification result.
The specific steps of the invention include the following:
(1) Read the training image and test image:
Read two SAR images containing the same ground-object targets, one as the training image and one as the test image; each SAR image read contains at least three classes of ground-object targets, and for any two classes of ground-object targets in the training image, the ratio of the larger pixel count to the smaller pixel count is controlled within the range of 1 to 1.5;
(2) Generate the training set and test sets:
(2a) Construct a sliding window of 90 × 90 pixels;
(2b) Place the sliding window at the upper-left corner of the training image and slide it in order from left to right and from top to bottom with a sliding step of 10 pixels, taking the training-image region covered by the sliding window at each position as one training image pixel block;
(2c) Count the pixels of each class in every training image pixel block; if the pixel count of a class is greater than or equal to 0.5 times the total number of pixels in the sliding window, take that class as the class of the pixel block, forming one sample;
(2d) Gather all samples of each class into that class's training dataset;
(2e) Randomly shuffle each class's training dataset and randomly select an equal number of samples from each, forming the initial training sample set, where the number of training samples chosen per class is not less than 2500;
(2f) Rotate each training sample in the initial training sample set 90 degrees clockwise about its central pixel, obtaining the first augmentation set;
(2g) Add Gaussian noise with mean 0 and standard deviation 0.1 to each training sample in the initial training sample set, obtaining the second augmentation set;
(2h) Combine the initial training sample set, the first augmentation set, and the second augmentation set into the training set;
(2i) Place the sliding window at 5 different starting positions of the test image and slide it in order from left to right and from top to bottom with a sliding step of 90 pixels, taking the test-image region covered by the sliding window at each position as one test image pixel block, forming the first, second, third, fourth, and fifth test sets respectively;
(3) Build the deconvolution network with atrous spatial pyramid pooling ASPP:
(3a) Build a deconvolution network whose structure is, in order: input layer → convolutional layer 1 → pooling layer → nested module → concatenation layer 1 → convolutional layer 2 → upsampling layer → output layer;
The nested module consists of the first ASPP block connected in parallel with branch 1; the structure of branch 1 is, in order: convolutional layer 3 → nested submodule 1 → concatenation layer 2 → convolutional layer 4 → deconvolutional layer 1;
Nested submodule 1 consists of the second ASPP block connected in parallel with branch 2; the structure of branch 2 is, in order: convolutional layer 5 → nested submodule 2 → concatenation layer 3 → convolutional layer 6 → deconvolutional layer 2;
Nested submodule 2 consists of the third ASPP block connected in parallel with branch 3; the structure of branch 3 is, in order: convolutional layer 7 → the fourth ASPP block → deconvolutional layer 3;
(3b) Set the parameters of the deconvolution network;
(4) Train the deconvolution network with atrous spatial pyramid pooling ASPP:
Input the training set into the deconvolution network with atrous spatial pyramid pooling ASPP and train it until the network parameters converge, obtaining the trained deconvolution network;
(5) Classify the test image:
(5a) Input the first, second, third, fourth, and fifth test sets into the trained deconvolution network in turn, obtaining the preliminary classification pixel block of each test set;
(5b) For each pixel position, find the label that appears most often at the corresponding position across all preliminary classification pixel blocks and take it as the class of that pixel, obtaining the final classification pixel block; take the final classification pixel block as the final classification result of the test image.
Compared with the prior art, the present invention has the following advantages:
First, since the present invention uses four atrous spatial pyramid pooling ASPP blocks to perform atrous spatial pyramid pooling on the feature maps, it overcomes the insufficient classification accuracy of prior-art algorithms caused by under-using multi-scale target information, so that the present invention improves the accuracy of SAR image classification.
Second, since the present invention uses upsampling layers to perform upsampling on the feature maps, the classification results of a whole block of image pixels can be obtained at once, overcoming the low SAR image classification efficiency of prior-art algorithms that obtain the classification result of only one pixel per test, so that the present invention improves efficiency in SAR image classification.
Third, since the present invention inputs the test sets directly into the trained deconvolution network to obtain their preliminary classification pixel blocks, it overcomes the inability of prior-art algorithms to achieve end-to-end classification, so that the present invention is simple to operate in SAR image classification.
Brief description of the drawings
Figure 1 is the flow chart of the invention;
Figure 2 is the network structure diagram of the invention;
Figure 3 shows the measured SAR images used in the simulation experiments of the invention;
Figure 4 shows the ground-truth label maps of the measured SAR images used in the simulation experiments of the invention;
Figure 5 shows the classification results of the simulation experiments of the invention.
Specific embodiment
The present invention will be further described below with reference to the accompanying drawings.
Referring to Fig. 1, the specific steps of the invention are further described.
Step 1, read the training image and test image.
Read two SAR images containing the same ground-object targets, one as the training image and one as the test image; each SAR image read contains at least three classes of ground-object targets, and for any two classes of ground-object targets in the training image, the ratio of the larger pixel count to the smaller pixel count is controlled within the range of 1 to 1.5.
Step 2, generate the training set and test sets.
Construct a sliding window of 90 × 90 pixels.
Place the sliding window at the upper-left corner of the training image and slide it in order from left to right and from top to bottom with a sliding step of 10 pixels, taking the training-image region covered by the sliding window at each position as one training image pixel block.
Count the pixels of each class in every training image pixel block; if the pixel count of a class is greater than or equal to 0.5 times the total number of pixels in the sliding window, take that class as the class of the pixel block, forming one sample.
Gather all samples of each class into that class's training dataset.
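The window-sliding and block-labelling rule above can be sketched as follows (a minimal NumPy illustration; the function name, the toy half-and-half label image, and the dictionary layout are assumptions for demonstration, not part of the patent):

```python
import numpy as np

def extract_samples(image, labels, win=90, step=10, keep_frac=0.5):
    # Slide a win x win window left-to-right, top-to-bottom with the given
    # step; a window becomes a sample of class c only if pixels of class c
    # cover at least keep_frac of the window, otherwise it is discarded.
    samples = {}
    h, w = labels.shape
    for r in range(0, h - win + 1, step):
        for c in range(0, w - win + 1, step):
            block = labels[r:r + win, c:c + win]
            vals, counts = np.unique(block, return_counts=True)
            top = counts.argmax()
            if counts[top] >= keep_frac * win * win:
                samples.setdefault(int(vals[top]), []).append(
                    image[r:r + win, c:c + win])
    return samples

# Toy 100 x 100 image whose left half is class 0 and right half is class 1.
img = np.random.rand(100, 100).astype(np.float32)
lab = np.zeros((100, 100), dtype=np.int64)
lab[:, 50:] = 1
sets = extract_samples(img, lab)
print({k: len(v) for k, v in sorted(sets.items())})  # {0: 2, 1: 2}
```

With a 100 × 100 toy image the window fits at four positions; each window is dominated (over half of its 8100 pixels) by one class, so two samples land in each class's dataset.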
Randomly shuffle each class's training dataset and randomly select an equal number of samples from each, forming the initial training sample set, where the number of training samples chosen per class is not less than 2500.
Rotate each training sample in the initial training sample set 90 degrees clockwise about its central pixel, obtaining the first augmentation set.
Add Gaussian noise with mean 0 and standard deviation 0.1 to each training sample in the initial training sample set, obtaining the second augmentation set.
Combine the initial training sample set, the first augmentation set, and the second augmentation set into the training set.
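The two augmentation operations above (a 90-degree clockwise rotation about the centre, and additive Gaussian noise with mean 0 and standard deviation 0.1) can be sketched as follows; the fixed random seed and the toy 2 × 2 patch are assumptions for demonstration only:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def augment(sample):
    # np.rot90 with k=-1 rotates the array 90 degrees clockwise about its
    # centre; the noisy copy adds zero-mean Gaussian noise with std 0.1.
    rotated = np.rot90(sample, k=-1)
    noisy = sample + rng.normal(0.0, 0.1, size=sample.shape)
    return rotated, noisy

patch = np.arange(4.0).reshape(2, 2)   # toy stand-in for a 90 x 90 sample
rot, noi = augment(patch)
print(rot)  # [[2. 0.]
            #  [3. 1.]]
```

The rotated and noisy copies, together with the originals, triple the size of the training set without touching the test image.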
Place the sliding window at 5 different starting positions of the test image and slide it in order from left to right and from top to bottom with a sliding step of 90 pixels, taking the test-image region covered by the sliding window at each position as one test image pixel block, forming the first, second, third, fourth, and fifth test sets respectively.
Placing the sliding window at 5 different starting positions of the test image means: the test set obtained with the upper-left corner of the test image as the starting position is the first test set; the test set obtained with the starting position translated 30 pixels to the right of the upper-left corner is the second test set; with the starting position translated 60 pixels to the right, the third test set; with the starting position translated 30 pixels downward, the fourth test set; and with the starting position translated 60 pixels downward, the fifth test set.
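The five shifted test sets can be sketched as follows (a NumPy illustration; the 270 × 270 toy image size and the function name are assumptions, and border regions not covered by a full 90 × 90 block are simply dropped in this sketch):

```python
import numpy as np

# (row, col) starting positions: upper-left corner, 30 and 60 pixels to the
# right, and 30 and 60 pixels downward, as described above.
OFFSETS = [(0, 0), (0, 30), (0, 60), (30, 0), (60, 0)]

def tile_test_sets(image, win=90):
    # From each starting position, cut non-overlapping win x win blocks
    # (the sliding step equals the window size).
    test_sets = []
    h, w = image.shape
    for r0, c0 in OFFSETS:
        blocks = [image[r:r + win, c:c + win]
                  for r in range(r0, h - win + 1, win)
                  for c in range(c0, w - win + 1, win)]
        test_sets.append(blocks)
    return test_sets

img = np.zeros((270, 270), dtype=np.float32)   # toy test image
five_sets = tile_test_sets(img)
print([len(s) for s in five_sets])  # [9, 6, 6, 6, 6]
```

The shifted tilings cover each interior pixel several times, which is what makes the per-pixel voting of step 5 possible.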
Step 3, build the deconvolution network with atrous spatial pyramid pooling ASPP.
Referring to Fig. 2, the network structure of the invention is further described.
Build a deconvolution network whose structure is, in order: input layer → convolutional layer 1 → pooling layer → nested module → concatenation layer 1 → convolutional layer 2 → upsampling layer → output layer.
The nested module consists of the first ASPP block connected in parallel with branch 1; the structure of branch 1 is, in order: convolutional layer 3 → nested submodule 1 → concatenation layer 2 → convolutional layer 4 → deconvolutional layer 1.
Nested submodule 1 consists of the second ASPP block connected in parallel with branch 2; the structure of branch 2 is, in order: convolutional layer 5 → nested submodule 2 → concatenation layer 3 → convolutional layer 6 → deconvolutional layer 2.
Nested submodule 2 consists of the third ASPP block connected in parallel with branch 3; the structure of branch 3 is, in order: convolutional layer 7 → the fourth ASPP block → deconvolutional layer 3.
The first, second, third, and fourth atrous spatial pyramid pooling ASPP blocks each have the following structure, in order: 4 parallel branches → concatenation layer → convolutional layer; among the 4 parallel branches, the 1st, 2nd, and 3rd branches are atrous convolutional layers, and the structure of the 4th branch is, in order: pooling layer → upsampling layer.
The parameters of the atrous spatial pyramid pooling ASPP blocks are set as follows:
The dilation rates of the atrous convolutional layers of the 1st, 2nd, and 3rd branches are set to 2, 4, and 8 pixels respectively; each atrous convolution kernel is 3 × 3, and each sliding step is 1 pixel.
The pooling mode of the pooling layer of the 4th branch is set to global average pooling.
The upsampling layer of the 4th branch is set to bilinear-interpolation upsampling.
The convolution kernel of the convolutional layer is set to 1 × 1, with a sliding step of 1 pixel.
The concatenation layer is set to a matrix concatenation function.
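Under the parameters just listed, one such ASPP block can be sketched in PyTorch (an illustrative assumption: the class name, the channel width of 32, and the use of padding equal to the dilation rate to preserve spatial size are choices of this sketch, not the patent's implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    # Three parallel 3x3 atrous convolutions (dilation rates 2, 4, 8,
    # stride 1) plus a global-average-pooling branch upsampled back by
    # bilinear interpolation; the four outputs are concatenated and fused
    # by a 1x1 convolution, as the parameter list above describes.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, in_ch, 3, padding=2, dilation=2)
        self.branch2 = nn.Conv2d(in_ch, in_ch, 3, padding=4, dilation=4)
        self.branch3 = nn.Conv2d(in_ch, in_ch, 3, padding=8, dilation=8)
        self.pool = nn.AdaptiveAvgPool2d(1)          # global average pooling
        self.fuse = nn.Conv2d(4 * in_ch, out_ch, 1)  # 1x1 conv after concat

    def forward(self, x):
        h, w = x.shape[2:]
        pooled = F.interpolate(self.pool(x), size=(h, w), mode="bilinear",
                               align_corners=False)
        cat = torch.cat([self.branch1(x), self.branch2(x),
                         self.branch3(x), pooled], dim=1)
        return self.fuse(cat)

x = torch.randn(1, 32, 45, 45)       # toy feature map
y = ASPP(32, 32)(x)
print(tuple(y.shape))  # (1, 32, 45, 45)
```

The three dilation rates sample the feature map at three receptive-field scales while the pooled branch adds image-level context, which is how the block captures multi-scale target information.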
Set the parameters of the deconvolution network.
The parameters of the deconvolution network are set as follows:
The number of feature maps of convolutional layer 1 is set to 32, with a 5 × 5 convolution kernel and a sliding step of 1 pixel.
The number of feature maps of convolutional layer 2 is set to the total number of classes of ground-object targets contained in the SAR image to be classified, with a 1 × 1 convolution kernel and a sliding step of 1 pixel.
The numbers of feature maps of convolutional layers 3, 5, and 7 are set to 64, 128, and 256 respectively; each convolution kernel is 5 × 5, and each sliding step is 2 pixels.
The numbers of feature maps of convolutional layers 4 and 6 are set to 64 and 128 respectively; each convolution kernel is 1 × 1, and each sliding step is 1 pixel.
The pooling mode of the pooling layer is set to max pooling, with 32 feature maps, a 2 × 2 pooling window, and a sliding step of 2 pixels.
The numbers of feature maps of deconvolutional layers 1, 2, and 3 are set to 64, 128, and 256 respectively; each convolution kernel is 5 × 5, and each sliding step is 2 pixels.
The numbers of feature maps of the first, second, third, and fourth atrous spatial pyramid pooling ASPP blocks are set to 32, 64, 128, and 256 respectively.
Concatenation layers 1, 2, and 3 are each set to a matrix concatenation function.
The upsampling layer is set to bilinear-interpolation upsampling.
Step 4, train the deconvolution network with atrous spatial pyramid pooling ASPP.
Input the training set into the deconvolution network with atrous spatial pyramid pooling ASPP and train it until the network parameters converge, obtaining the trained deconvolution network.
Step 5, classify the test image.
Input the first, second, third, fourth, and fifth test sets into the trained deconvolution network in turn, obtaining the preliminary classification pixel block of each test set.
For each pixel position, find the label that appears most often at the corresponding position across all preliminary classification pixel blocks and take it as the class of that pixel, obtaining the final classification pixel block; take the final classification pixel block as the final classification result of the test image.
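Assuming the preliminary classification pixel blocks of the five test sets have already been placed back into full-size label maps in test-image coordinates, the per-pixel majority vote can be sketched as follows (NumPy; three toy 2 × 2 label maps stand in for the five full-size ones, and border pixels covered by fewer shifted tilings are ignored in this sketch):

```python
import numpy as np

def majority_vote(label_maps):
    # Stack the label maps and, for every pixel position, count how often
    # each class label occurs across the maps; the most frequent label
    # becomes the final class of that pixel.
    stack = np.stack(label_maps)                 # (n_maps, H, W)
    n_classes = int(stack.max()) + 1
    counts = np.zeros((n_classes,) + stack.shape[1:], dtype=np.int64)
    for c in range(n_classes):
        counts[c] = (stack == c).sum(axis=0)
    return counts.argmax(axis=0)

maps = [np.array([[0, 1], [2, 2]]),
        np.array([[0, 1], [1, 2]]),
        np.array([[0, 0], [2, 2]])]
fused = majority_vote(maps)
print(fused)  # [[0 1]
              #  [2 2]]
```

Voting across the shifted tilings suppresses block-boundary errors that any single non-overlapping tiling would leave behind.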
The effect of the invention can be further illustrated by the following simulation experiments.
1. Simulation experiment conditions:
The simulation experiments of the invention were carried out on hardware with a 2.9 GHz Intel Xeon(R) CPU E3-1535M v5 and 31.1 GB of memory, in a software environment of MATLAB R2017a and Python 2.7.
2. Simulation content and result analysis:
In the simulation experiments, the present invention and two prior-art methods (the support vector machine classification method based on multi-level local pattern histogram MLPH features, and the classification method based on super-pixel segmentation and a convolution-deconvolution network) are each used to classify one test image from the measured SAR images.
Fig. 3 shows the measured SAR images used in the simulation experiments of the invention, where Fig. 3(a) is the measured training image and Fig. 3(b) is the measured test image.
Fig. 4 shows the ground-truth label maps of the measured SAR images used in the simulation experiments of the invention, where Fig. 4(a) is the ground-truth label map of the measured training image in Fig. 3(a), and Fig. 4(b) is the ground-truth label map of the measured test image in Fig. 3(b).
Fig. 5 shows the classification results of the simulation experiments of the invention, where Fig. 5(a) is the classification result of the present invention on the measured test image of Fig. 3(b); Fig. 5(b) is the classification result of the prior-art SVM classification method based on MLPH features on the measured test image of Fig. 3(b); and Fig. 5(c) is the classification result of the prior-art classification method based on super-pixel segmentation and a convolution-deconvolution network on the measured test image of Fig. 3(b). Comparing Fig. 5(a), Fig. 5(b), and Fig. 5(c), the boundaries in the classification of the present method for the measured test image of Fig. 3(b) are smoother; and with reference to the ground-truth label map of the measured test image in Fig. 4(b), the classification result of the invention in Fig. 5(a) is closer to the ground-truth label map than the two prior-art results in Fig. 5(b) and Fig. 5(c), showing that the classification of the invention is more accurate and its classification effect better.
Table 1 compares the classification performance of the present method and the two prior-art methods on the measured test image of Fig. 3(b). In the table, F1 denotes the SVM classification method based on MLPH features, F2 denotes the classification method based on super-pixel segmentation and a convolution-deconvolution network, and F3 denotes the present method. Classification accuracy in the table is the ratio of the number of correctly classified pixels of the measured test image of Fig. 3(b) to the total number of pixels of that image, where a pixel is correctly classified when its classification result matches the corresponding ground-truth label. Average classification accuracy is the mean of the classification accuracies of all classes of ground-object targets when classifying the measured test image of Fig. 3(b); running time is the time used to classify that image.
Table 1. Performance comparison of the 3 classification methods on the classified test image
As can be seen from Table 1, compared with the two prior-art methods, the average classification accuracy of the present method on the measured test image of Fig. 3(b) is higher, and its running time is the shortest. This shows that the present invention improves classification accuracy by extracting the multi-scale target information of the SAR image with four atrous spatial pyramid pooling ASPP blocks, and that by using deconvolutional layers and an upsampling layer so that each test obtains the classification result of a whole SAR image pixel block, it achieves end-to-end classification, shortens the running time, and facilitates engineering implementation.

Claims (5)

1. A SAR image classification method with an ASPP deconvolution network, characterized by reading a training image and a test image, generating a training set and test sets, building a deconvolution network with atrous spatial pyramid pooling ASPP, training the deconvolution network with atrous spatial pyramid pooling ASPP, and classifying the test image, the method comprising the following steps:
(1) Read the training image and test image:
Read two SAR images containing the same ground-object targets, one as the training image and one as the test image; each SAR image read contains at least three classes of ground-object targets, and for any two classes of ground-object targets in the training image, the ratio of the larger pixel count to the smaller pixel count is controlled within the range of 1 to 1.5;
(2) Generate the training set and test sets:
(2a) Construct a sliding window of 90 × 90 pixels;
(2b) Place the sliding window at the upper-left corner of the training image and slide it in order from left to right and from top to bottom with a sliding step of 10 pixels, taking the training-image region covered by the sliding window at each position as one training image pixel block;
(2c) Count the pixels of each class in every training image pixel block; if the pixel count of a class is greater than or equal to 0.5 times the total number of pixels in the sliding window, take that class as the class of the pixel block, forming one sample;
(2d) Gather all samples of each class into that class's training dataset;
(2e) Randomly shuffle each class's training dataset and randomly select an equal number of samples from each, forming the initial training sample set, where the number of training samples chosen per class is not less than 2500;
(2f) Rotate each training sample in the initial training sample set 90 degrees clockwise about its central pixel, obtaining the first augmentation set;
(2g) Add Gaussian noise with mean 0 and standard deviation 0.1 to each training sample in the initial training sample set, obtaining the second augmentation set;
(2h) Combine the initial training sample set, the first augmentation set, and the second augmentation set into the training set;
(2i) Place the sliding window at 5 different starting positions of the test image and slide it in order from left to right and from top to bottom with a sliding step of 90 pixels, taking the test-image region covered by the sliding window at each position as one test image pixel block, forming the first, second, third, fourth, and fifth test sets respectively;
(3) Construct the deconvolution network with atrous spatial pyramid pooling (ASPP):
(3a) Build a deconvolution network whose structure is, in order: input layer → convolutional layer 1 → pooling layer → nested module → concatenation layer 1 → convolutional layer 2 → upsampling layer → output layer;
The nested module consists of the first ASPP module in parallel with branch 1, where the structure of branch 1 is, in order: convolutional layer 3 → nested submodule 1 → concatenation layer 2 → convolutional layer 4 → deconvolutional layer 1;
The nested submodule 1 consists of the second ASPP module in parallel with branch 2, where the structure of branch 2 is, in order: convolutional layer 5 → nested submodule 2 → concatenation layer 3 → convolutional layer 6 → deconvolutional layer 2;
The nested submodule 2 consists of the third ASPP module in parallel with branch 3, where the structure of branch 3 is, in order: convolutional layer 7 → the fourth ASPP module → deconvolutional layer 3;
(3b) Set the parameters of the deconvolution network;
(4) Train the deconvolution network with ASPP:
Input the training set into the deconvolution network with ASPP and train it until the network parameters converge, obtaining the trained deconvolution network;
(5) Classify the test image:
(5a) Input the first, second, third, fourth and fifth test sets in turn into the trained deconvolution network to obtain the preliminary classification pixel blocks of each test set;
(5b) For each pixel, take the label that occurs most often at its position across all preliminary classification pixel blocks as the class of that pixel, yielding the final classification pixel block; take the final classification pixel block as the final classification result of the test image.
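Assuming the five preliminary classification results have already been stitched back into full-size label maps, the per-pixel fusion of step (5b) reduces to a per-position majority vote; a minimal sketch (ties fall to the smallest label via argmax, a detail the claim leaves open):

```python
import numpy as np

def majority_vote(label_maps):
    """Per-pixel majority vote over several full-size label maps (step 5b).

    label_maps: array-like of shape (n_maps, H, W) with integer labels.
    Returns the H x W map whose entry at each position is the label that
    occurs most often at that position across the maps.
    """
    maps = np.asarray(label_maps)
    n_classes = maps.max() + 1
    # votes[c, i, j] = how many maps assign class c to pixel (i, j)
    votes = np.stack([(maps == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)
```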
2. The SAR image classification method with ASPP deconvolution network according to claim 1, wherein placing the sliding window at 5 different starting positions of the test image in step (2i) means: the test set obtained with the upper-left corner of the test image as the starting position is the first test set; the test set obtained with the starting position 30 pixels to the right of the upper-left corner is the second test set; the test set obtained with the starting position 60 pixels to the right of the upper-left corner is the third test set; the test set obtained with the starting position 30 pixels below the upper-left corner is the fourth test set; the test set obtained with the starting position 60 pixels below the upper-left corner is the fifth test set.
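The five starting positions of claim 2, combined with the 90-pixel step of step (2i), can be sketched as follows (the function name is illustrative; windows that would extend past the image border are simply skipped):

```python
import numpy as np

# The five starting offsets (row, col) named in claim 2: the upper-left
# corner, 30 and 60 pixels to the right, and 30 and 60 pixels down.
OFFSETS = [(0, 0), (0, 30), (0, 60), (30, 0), (60, 0)]

def tile_test_image(image, win=90, stride=90):
    """Return the five test sets of step (2i): non-overlapping win x win
    blocks taken from each of the five starting offsets."""
    H, W = image.shape
    test_sets = []
    for r0, c0 in OFFSETS:
        blocks = []
        for r in range(r0, H - win + 1, stride):
            for c in range(c0, W - win + 1, stride):
                blocks.append(image[r:r + win, c:c + win])
        test_sets.append(np.array(blocks))
    return test_sets
```

On a 180 × 180 image the first test set holds four full tiles, while each shifted start yields two, since the shifted grids lose one row or column of tiles at the border.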
3. The SAR image classification method with ASPP deconvolution network according to claim 1, wherein each of the first, second, third and fourth ASPP modules in step (3a) has the structure, in order: 4 parallel branches → concatenation layer → convolutional layer, where the 1st, 2nd and 3rd of the 4 parallel branches are each an atrous convolutional layer, and the structure of the 4th branch is, in order: pooling layer → upsampling layer.
4. The SAR image classification method with ASPP deconvolution network according to claim 3, wherein the parameters of the ASPP modules are set as follows:
the dilation rates of the atrous convolutional layers of the 1st, 2nd and 3rd branches are set to 2, 4 and 8 pixels respectively, each atrous convolution kernel size is 3 × 3, and each sliding step is 1 pixel;
the pooling mode of the pooling layer of the 4th branch is set to global average pooling;
the upsampling layer of the 4th branch is set to bilinear interpolation upsampling;
the convolution kernel size of the convolutional layer is set to 1 × 1, with a sliding step of 1 pixel;
the concatenation layer is set to a matrix concatenation function.
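A single-channel toy version of the ASPP module of claims 3-4 can be sketched as follows. It is a sketch under stated assumptions, not the patented implementation: the real module operates on 32 to 256 feature maps (claim 5), and upsampling the 1 × 1 globally pooled map back to the input size reduces here to a constant broadcast, which makes bilinear interpolation trivial.

```python
import numpy as np

def dilated_conv(x, kernel, rate):
    """'Same' 3x3 atrous convolution with dilation `rate`, stride 1.
    x: (H, W) single-channel map; kernel: (3, 3) weights."""
    H, W = x.shape
    pad = rate  # (kernel_size - 1) // 2 * rate for a 3x3 kernel
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * xp[i * rate:i * rate + H, j * rate:j * rate + W]
    return out

def aspp(x, kernels, w_1x1):
    """Claim-3/4 ASPP sketch: three atrous branches with rates 2, 4, 8,
    plus a global-average-pooling branch upsampled back to the input
    size, concatenated and fused by a 1x1 convolution."""
    branches = [dilated_conv(x, k, r) for k, r in zip(kernels, (2, 4, 8))]
    gap = np.full_like(x, x.mean(), dtype=float)  # pool to 1x1, upsample
    stacked = np.stack(branches + [gap])          # concatenation layer
    return np.tensordot(w_1x1, stacked, axes=1)   # 1x1 conv over channels
```

With rate 2, a 3 × 3 kernel samples the input at offsets of ±2 pixels, which is how the three branches see the scene at three different receptive-field sizes without extra parameters.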
5. The SAR image classification method with ASPP deconvolution network according to claim 1, wherein the parameters of the deconvolution network in step (3b) are set as follows:
the number of feature maps of convolutional layer 1 is set to 32, with a kernel size of 5 × 5 and a sliding step of 1 pixel;
the number of feature maps of convolutional layer 2 is set to the total number of classes of ground object targets contained in the SAR image to be classified, with a kernel size of 1 × 1 and a sliding step of 1 pixel;
the numbers of feature maps of convolutional layers 3, 5 and 7 are set to 64, 128 and 256 respectively, each with a kernel size of 5 × 5 and a sliding step of 2 pixels;
the numbers of feature maps of convolutional layers 4 and 6 are set to 64 and 128 respectively, each with a kernel size of 1 × 1 and a sliding step of 1 pixel;
the pooling mode of the pooling layer is set to max pooling, with 32 feature maps, a pooling window size of 2 × 2 and a sliding step of 2 pixels;
the numbers of feature maps of deconvolutional layers 1, 2 and 3 are set to 64, 128 and 256 respectively, each with a kernel size of 5 × 5 and a sliding step of 2 pixels;
the numbers of feature maps of the first, second, third and fourth ASPP modules are set to 32, 64, 128 and 256 respectively;
concatenation layers 1, 2 and 3 are each set to a matrix concatenation function;
the upsampling layer is set to bilinear interpolation upsampling.
CN201910829626.6A 2019-01-25 2019-09-03 SAR image classification method with ASPP deconvolution network Active CN110414494B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019100710516 2019-01-25
CN201910071051 2019-01-25

Publications (2)

Publication Number Publication Date
CN110414494A true CN110414494A (en) 2019-11-05
CN110414494B CN110414494B (en) 2022-12-02

Family

ID=68370301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910829626.6A Active CN110414494B (en) 2019-01-25 2019-09-03 SAR image classification method with ASPP deconvolution network

Country Status (1)

Country Link
CN (1) CN110414494B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160104058A1 (en) * 2014-10-09 2016-04-14 Microsoft Technology Licensing, Llc Generic object detection in images
WO2018028255A1 (en) * 2016-08-11 2018-02-15 深圳市未来媒体技术研究院 Image saliency detection method based on adversarial network
CN107944353A (en) * 2017-11-10 2018-04-20 西安电子科技大学 SAR image change detection based on profile ripple BSPP networks
CN108764330A (en) * 2018-05-25 2018-11-06 西安电子科技大学 SAR image sorting technique based on super-pixel segmentation and convolution deconvolution network

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827275A (en) * 2019-11-22 2020-02-21 吉林大学第一医院 Liver MRI arterial-phase image quality grading method based on Raspberry Pi and deep learning
CN110827275B (en) * 2019-11-22 2023-12-22 吉林大学第一医院 Liver MRI arterial-phase image quality grading method based on Raspberry Pi and deep learning
CN111563528A (en) * 2020-03-31 2020-08-21 西北工业大学 SAR image classification method based on multi-scale feature learning network and bilateral filtering
CN111563528B (en) * 2020-03-31 2022-03-11 西北工业大学 SAR image classification method based on multi-scale feature learning network and bilateral filtering

Also Published As

Publication number Publication date
CN110414494B (en) 2022-12-02

Similar Documents

Publication Publication Date Title
CN108564006B Polarimetric SAR terrain classification method based on self-learning convolutional neural network
CN104077599B Polarimetric SAR image classification method based on deep neural network
CN107229918A SAR image target detection method based on fully convolutional neural networks
CN108830330B Multispectral image classification method based on adaptive feature fusion residual network
CN110084159A Hyperspectral image classification method based on joint multi-stage spatial-spectral information CNN
CN110263705A Change detection method for two-phase high-resolution remote sensing images in the remote sensing field
CN108921030B SAR automatic target recognition method
CN110110596B Hyperspectral image feature extraction, classification model construction and classification method
CN103927551B Polarimetric SAR semi-supervised classification method based on superpixel correlation matrix
CN107392130A Multispectral image classification method based on adaptive thresholding and convolutional neural networks
CN104732244B Remote sensing image classification method integrating wavelet transform, multi-strategy PSO and SVM
CN109389080A Hyperspectral image classification method based on semi-supervised WGAN-GP
CN107358203B High-resolution SAR image classification method based on deep convolutional ladder network
CN103761526B Urban area detection method based on feature position optimization and integration
CN103578110A Multi-band high-resolution remote sensing image segmentation method based on gray-level co-occurrence matrix
CN113160062B Infrared image target detection method, device, equipment and storage medium
CN108960404B Image-based crowd counting method and device
CN107944370A Polarimetric SAR image classification method based on DCCGAN model
CN111639587B Hyperspectral image classification method based on multi-scale spectral-spatial convolutional neural network
CN110309781A Remote sensing recognition method for building damage based on adaptive multi-scale spectrum-texture fusion
CN104299232B SAR image segmentation method based on adaptive-window directionlet domain and improved FCM
CN103247059A Remote sensing image region-of-interest detection method based on integer wavelets and visual features
CN113657326A Weed detection method based on multi-scale fusion module and feature enhancement
CN113128335B Method, system and application for detection, classification and discovery of microfossil images
CN112232328A Remote sensing image building area extraction method and device based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant