CN114663439A - Remote sensing image land and sea segmentation method

Info

Publication number
CN114663439A
CN114663439A
Authority
CN
China
Prior art keywords
sea
land
sensing image
remote sensing
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210280187.XA
Other languages
Chinese (zh)
Inventor
Guo Haitao
Lu Jun
Gong Zhihui
Yan Xiaodong
Zhang Heng
Lin Yuzhun
Liu Xiangyun
Gao Hui
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force
Priority to CN202210280187.XA
Publication of CN114663439A
Legal status: Pending

Classifications

    • G06T 7/10 Image analysis: Segmentation; Edge detection
    • G06T 7/13 Image analysis: Edge detection
    • G06N 3/045 Neural networks: Combinations of networks
    • G06N 3/08 Neural networks: Learning methods
    • G06T 2207/10032 Image acquisition modality: Satellite or aerial image; Remote sensing
    • G06T 2207/20081 Special algorithmic details: Training; Learning
    • G06T 2207/20084 Special algorithmic details: Artificial neural networks [ANN]


Abstract

The invention relates to a sea-land segmentation method for remote sensing images, belonging to the technical field of remote sensing image processing. The method adopts an encoder-decoder architecture: a multilayer encoding module based on the Res2Net network extracts features at different scales, and squeeze-and-attention modules enhance the feature maps at each scale to strengthen information at weak sea-land boundaries. During training, a deep supervision strategy trains the outputs of the different decoding modules separately, which strengthens the network's ability to learn target boundary information and improves the accuracy of sea-land boundary segmentation. Experiments on two groups of remote sensing image data sets covering different coast types show that the method obtains more accurate sea-land segmentation results and clearer, more complete sea-land boundaries.

Description

Remote sensing image land and sea segmentation method
Technical Field
The invention relates to a remote sensing image sea-land segmentation method, and belongs to the technical field of remote sensing image processing.
Background
With the rapid development of remote sensing technology, data acquisition means have diversified, and the spatial, temporal, spectral and radiometric resolutions of remote sensing images continue to improve, providing ample data support for the study of large-scale coastal areas. Effectively distinguishing ocean from land on remote sensing images, that is, achieving fast and accurate sea-land segmentation, has important application value for coastline extraction, island and reef identification, offshore target detection and the like. Traditional sea-land segmentation methods include threshold segmentation, active contour models, region growing, and Markov random field-based methods. They rely mainly on differences between sea and land in gray level, texture and similar cues, so they can obtain good results when the gray-level contrast at the sea-land boundary is obvious and the waterline shape is simple; however, they are easily disturbed by noise, their results must be regulated by manually set parameters, and their robustness is poor.
Deep learning, particularly convolutional neural networks (CNNs), has surpassed traditional methods in image classification, object detection, semantic segmentation and related fields, and the advent of the fully convolutional network (FCN) drew attention to CNN-based semantic segmentation. Most semantic segmentation networks proposed in recent years follow the FCN design principle. SegNet, U-Net and their variants (e.g., UNet++) are typical: they employ an encoding-decoding structure consisting of an encoding path for feature extraction and a decoding path that restores the resolution of the feature maps, and they fully utilize the semantic information of each layer to obtain more detailed segmentation results. PSPNet uses a pyramid pooling module to extract multi-scale information of an image, while Deeplabv3+ applies atrous (dilated) convolution to semantic segmentation and proposes the Atrous Spatial Pyramid Pooling (ASPP) module, which samples with atrous convolutions of different rates in parallel to better capture multi-scale image information. The Dual Attention Network (DANet), the Criss-Cross attention Network (CCNet) and others introduce attention mechanisms into semantic segmentation, so that regions with similar features produce the same response through correlation measurement, strengthening the learning of region-specific features and the use of effective information. In addition, the Bilateral Segmentation Network (BiSeNet), BiSeNetV2 and similar designs balance the speed and accuracy of semantic segmentation to achieve real-time segmentation.
The rapid development of semantic segmentation networks provides sufficient theoretical support for sea-land segmentation of remote sensing images with CNNs. Researchers have constructed deep networks based on the encoding-decoding architecture, combined with post-processing to eliminate holes in the prediction results. Some built DeepUNet, a network deeper than U-Net, from ResNet's residual block (Res_Block), designing DownBlock and UpBlock modules to replace the convolutional layers in the encoding-decoding structure, and obtained better results than the original U-Net in optical remote sensing sea-land segmentation. Others used Res_Block to construct Res-UNet and post-processed the segmentation results with a conditional random field (CRF) model and morphological operations. Pourba et al. built a network of suitable depth on a standard U-shaped structure, aggregating multi-scale context information through densely connected residual blocks to achieve end-to-end sea-land segmentation. The sea-land segmentation task also concerns the accuracy of boundary segmentation, and some scholars have proposed multi-task frameworks that improve segmentation accuracy by adding network branches. For example, the multitask network SeNet performs sea-land segmentation and edge detection simultaneously, but it contains a large number of standard convolutions, occupying considerable storage space and running time. A multitask network combined with edge information has also been proposed, which extends a branch-structured edge network from the encoding-decoding structure and trains boundary semantic information in parallel with the segmentation network to obtain spatially consistent segmentation with good boundary localization. Dysoma et al. reduced the number of convolutional layers of the spatial path in BiSeNet to suit the characteristics of SAR images, and proposed an edge-enhancement loss strategy to improve the segmentation capability of the model.
Coastline types in China are complex and varied: the spectral, textural and shape characteristics of land objects differ among coastline types, and weak boundaries (silt coastlines) alternate with strong boundaries (artificial coastlines). Existing research can obtain good segmentation results when the sea-land boundary is simple, but cannot handle sea-land segmentation in complex scenes. Moreover, the task pays particular attention to boundary segmentation, yet sea-land boundary pixels occupy a low proportion of a remote sensing image, causing a sample imbalance problem. As a result, the accuracy of existing networks at the sea-land boundary is hard to guarantee, and evaluating only the region segmentation results, as most studies do, cannot reflect a network's ability to extract boundaries.
Disclosure of Invention
The invention aims to provide a remote sensing image sea-land segmentation method to solve the problem that the segmentation at the sea-land boundary of the current remote sensing image is inaccurate.
The invention provides a remote sensing image sea-land segmentation method to solve this technical problem, comprising the following steps (a minimal code sketch of these steps is given after the list):
1) obtaining remote sensing images and labelling the remote sensing image data set to form corresponding training set data;
2) establishing a remote sensing image sea-land segmentation model, wherein the model adopts an encoder-decoder structure; the encoder consists of multiple layers of encoding modules, each extracting features of the remote sensing image at a different scale; the decoder comprises decoding modules corresponding to the encoding modules, each fusing the output features of the corresponding encoding module with the up-sampled output of the preceding decoding module; the features processed by each decoding module are up-sampled to the size of the original image and fused, and edge detection is performed on the fused feature map;
3) training the established sea-land segmentation model with the training set data, using a deep supervision strategy to train each decoding module separately, and constructing the total loss function of the model from the loss functions of the individual decoding modules;
4) acquiring the remote sensing image to be segmented and inputting it into the trained model to realize its sea-land segmentation.
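A minimal training-and-inference sketch of steps 1)-4) follows. MSDSUNet, train_loader and test_image are illustrative placeholders, and deep_supervision_loss refers to the loss sketch given later in this description; none of this is the patent's reference implementation.

```python
import torch

# Step 2: build the encoder-decoder sea-land segmentation model (assumed class).
model = MSDSUNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for image, mask in train_loader:                 # step 1: labelled training tiles
    side_logits, fused_logits = model(image)     # side outputs + fused output
    loss = deep_supervision_loss(side_logits, fused_logits, mask)  # step 3
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Step 4: segment a new image with the trained model.
model.eval()
with torch.no_grad():
    sea_land_mask = torch.sigmoid(model(test_image)[1]) > 0.5
```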
The invention adopts an encoder-decoder architecture and uses a multilayer encoding module to extract features of different sizes, improving the representation of the remote sensing image; during training, a deep supervision strategy trains the outputs of the different decoding modules separately, which strengthens the network's ability to learn target boundary information and improves the accuracy of sea-land boundary segmentation.
Furthermore, the multilayer encoding module adopts a Res2Net network comprising 5 layers of encoding modules: the first layer comprises a convolution layer and a max-pooling layer for extracting features from the input remote sensing image, and each remaining layer adopts residual blocks to process the output of the previous layer.
Further, the residual block adopts Res2_Block.
The invention divides the feature map inside the Res2_Block into several channel groups and designs residual-like connections between the groups, so that the network improves its multi-scale representation capability at a finer-grained level.
Furthermore, the remote sensing image sea-land segmentation model also comprises a compression and attention module, and the output result of each layer of coding module is processed by the corresponding compression and attention module.
The invention utilizes the compression and attention module to promote useful features and restrain features which are not useful for the current task, and the extraction capability of the network to the weak boundary features is enhanced by giving greater weight to the feature map at the weak sea-land boundary.
Further, each layer decoding module includes one Res2_ Block and one Upsample for gradually restoring the feature map to the original input image size.
Further, the loss function during the training of the remote sensing image sea-land segmentation model is:

$$l = \sum_{m=1}^{M} W_{side}^{(m)} \, l_{side}^{(m)} + W_{fuse} \, l_{fuse}$$

wherein M is equal to the number of layers of the decoding module and is also the number of layers of the encoding module; $l_{side}^{(m)}$ represents the loss function of the m-th layer decoding module; $l_{fuse}$ represents the loss function after the fusion of the decoding modules of each layer; $W_{side}^{(m)}$ and $W_{fuse}$ respectively represent the weight of the loss function of the m-th layer decoding module and the weight of the fused loss function.
Further, the loss function of each layer of decoding module and the fused loss function both adopt the BCE loss function of the semantic segmentation binary classification task.
Further, the BCE loss function is:

$$l_{BCE} = -\sum_{(r,c)}^{(H,W)} \Big[ P_{G(r,c)} \log P_{S(r,c)} + \big(1 - P_{G(r,c)}\big) \log\big(1 - P_{S(r,c)}\big) \Big]$$

wherein (r, c) represents the coordinates of a pixel, H and W represent the height and width of the image, and $P_{G(r,c)}$ and $P_{S(r,c)}$ respectively represent the true value and the predicted value of the pixel.
Drawings
FIG. 1 is a schematic structural diagram of a sea-land segmentation model of a remote sensing image adopted by the present invention;
FIG. 2a is a block diagram of the residual block of ResNet;
FIG. 2b is a block diagram of the residual block of Res2Net employed in the present invention;
FIG. 3a is a block diagram of an SE module;
FIG. 3b is a block diagram of the SA module employed in the present invention;
FIG. 4a is the training Data and corresponding sample labels for the first scene in Data1 in the experimental example of the present invention;
FIG. 4b shows training Data and corresponding sample labels for a second scenario in Data1 in an example of the present invention;
FIG. 4c is the training Data and corresponding sample labels for the first scene in Data2 in the experimental example of the present invention;
FIG. 4d shows training Data and corresponding sample labels for the second scenario in Data2 in the experimental example of the present invention;
FIG. 5 is a graph showing a comparison of the segmentation results of the present invention and other conventional segmentation models in the experimental examples.
Detailed Description
The following further describes embodiments of the present invention with reference to the drawings.
The method adopts a network model with an encoding-decoding structure as the sea-land segmentation model for remote sensing images. The novel backbone network Res2Net serves as the encoder to extract multi-scale image features, and the features extracted at each layer are processed by a squeeze-and-attention module to enhance the network's ability to extract weak boundaries. During decoding, the upsampled (Upsample) feature map of each layer serves as a side output of the network, and the side outputs are merged to realize multi-level feature fusion; a deep supervision strategy is applied to each feature fusion result. Finally, edge detection is performed on the sea-land segmentation result output by the network to obtain the waterline, realizing sea-land segmentation of the remote sensing image.
1. Acquiring a remote sensing image data set and labelling it to form corresponding training set data.
In this embodiment, the sea and land categories of the remote sensing images in the data set are labelled pixel by pixel to generate corresponding label images, and the generated label images are used as the training set.
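As a minimal sketch, one way to turn such per-pixel annotations into binary label masks is shown below; the grayscale coding (black = land, gray = sea) follows the sample labels described for FIG. 4, and the exact pixel values are assumptions.

```python
import numpy as np

def make_label(annotation):
    """annotation: uint8 grayscale label image (0 = land, nonzero gray = sea).
    Returns a binary mask with 1 for sea and 0 for land."""
    return (annotation > 0).astype(np.uint8)
```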
2. Establishing the remote sensing image sea-land segmentation model.
The invention establishes the remote sensing image sea-land segmentation model shown in FIG. 1. The model is a Multi-Scale Deep Supervision U-shaped Network (MSDSUNet) comprising an encoder and a decoder: the encoder adopts a Res2Net network together with squeeze-and-attention modules; the decoder comprises 5 layers of decoding modules corresponding to the encoder's 5 layers of output features at different scales, each decoding module consisting of a Res2_Block and an Upsample.
A convolutional neural network can extract more abstract features by continually increasing its depth and parameters, but deepening the network causes problems such as vanishing or exploding gradients and network degradation, making the network hard to converge; the ResNet design is adopted to overcome this. The residual block structure of ResNet is shown in FIG. 2a: a skip connection between input and output allows input information to be passed directly to later layers, simplifying network learning. Building on this, to improve multi-scale representation at a finer-grained level, the invention adopts the Res2Net network to extract multi-scale features.
The encoder in this embodiment uses a Res2Net-50 network. Taking an input image of size 512 × 512 × 3 as an example, the per-layer details are shown in Table 1. The encoder comprises 5 encoding modules and 5 squeeze-and-attention modules, one attention module per encoding module. The five encoding modules are Encoder_1, Encoder_2, Encoder_3, Encoder_4 and Encoder_5, and the 5 squeeze-and-attention modules are SA_1, SA_2, SA_3, SA_4 and SA_5. Encoder_1 consists of a convolution layer and a max-pooling layer, while Encoder_2, Encoder_3, Encoder_4 and Encoder_5 adopt Res2_Block.
TABLE 1
[Table 1: per-layer configuration of the Res2Net-50 encoder; reproduced as an image in the original publication and not recoverable here.]
The residual unit structure (Res2_Block) is shown in FIG. 2b. After a 1 × 1 convolutional layer, the feature map is divided into s subsets, denoted $x_i$ with $i \in \{1, 2, 3, \ldots, s\}$. Each subset $x_i$ has the same spatial size, but its number of channels is 1/s that of the original input feature map. Except for $x_1$, each $x_i$ has a corresponding 3 × 3 convolutional layer, denoted $K_i$, whose output for $x_i$ is denoted $y_i$; the subset $x_i$ is added to the output $y_{i-1}$ of $K_{i-1}$ before being fed into $K_i$. Thus $y_i$ is defined as:

$$y_i = \begin{cases} x_i, & i = 1 \\ K_i(x_i), & i = 2 \\ K_i(x_i + y_{i-1}), & 2 < i \le s \end{cases} \qquad (1)$$

It follows that each 3 × 3 convolution kernel $K_i$ in Res2_Block receives the feature information of all preceding feature map subsets $\{x_j, j \le i\}$, so that after a 3 × 3 convolution the subset $x_j$ yields an output with a larger receptive field than $x_j$ itself. To fuse information at different scales, all $y_i$ are concatenated and fused with a 1 × 1 convolution. This split-and-merge strategy lets the convolutional layers process the feature map more effectively: the output of Res2_Block contains different receptive fields, which benefits multi-scale feature extraction and lets the network capture local and global image features at a finer granularity.
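The following is a minimal PyTorch sketch of a Res2_Block consistent with equation (1); the scale s = 4, the channel arithmetic and the omission of batch normalization are illustrative assumptions, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class Res2Block(nn.Module):
    def __init__(self, channels, scale=4):
        super().__init__()
        assert channels % scale == 0
        self.scale = scale
        width = channels // scale
        self.conv_in = nn.Conv2d(channels, channels, 1, bias=False)
        # One 3x3 conv K_i per subset x_2..x_s (x_1 is passed through unchanged).
        self.convs = nn.ModuleList([
            nn.Conv2d(width, width, 3, padding=1, bias=False)
            for _ in range(scale - 1)
        ])
        self.conv_out = nn.Conv2d(channels, channels, 1, bias=False)

    def forward(self, x):
        identity = x
        xs = torch.chunk(self.conv_in(x), self.scale, dim=1)  # split into s subsets
        ys = [xs[0]]                                          # y_1 = x_1
        for i in range(1, self.scale):
            inp = xs[i] if i == 1 else xs[i] + ys[-1]         # x_i + y_{i-1}
            ys.append(self.convs[i - 1](inp))                 # y_i = K_i(...)
        out = self.conv_out(torch.cat(ys, dim=1))             # 1x1 fusion of all y_i
        return out + identity                                 # ResNet-style skip
```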
The input image passes through the convolution and max-pooling layers of Encoder_1 and then sequentially through the four encoding layers Encoder_2, Encoder_3, Encoder_4 and Encoder_5, and the output of each encoding layer enters its corresponding squeeze-and-attention module.
SENet (Squeeze-and-Excitation Networks) proposed the earliest channel attention mechanism, which is divided into two parts: compression (Squeeze) and excitation (Excitation). As shown in FIG. 3a, given an input feature map $X \in R^{C \times H \times W}$, where C, H and W respectively denote the number of channels, the height and the width, X is compressed into a channel descriptor by global average pooling, the descriptor is mapped to channel weights $\omega$ through two fully connected layers, and $\omega$ is multiplied channel by channel with the residual feature map $X_{res}$ to recalibrate the channels of the feature map. The output $X_{out}$ of the SE module can be expressed as:

$$\omega = \mathrm{Sigmoid}\big(\omega_2 \, \sigma(\omega_1 \, \mathrm{AvP}(X))\big) \qquad (2)$$

$$X_{out} = \omega * X_{res} + X_{res} \qquad (3)$$

wherein Sigmoid(·) is the sigmoid function, σ(·) is the ReLU activation function, $\omega_1$ and $\omega_2$ are the parameters of the two fully connected layers, and AvP(X) represents the global average pooling operation on X.
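A minimal PyTorch sketch of the SE module in equations (2)-(3) follows; the channel reduction ratio of 16 in the two fully connected layers is a common convention and an assumption here.

```python
import torch
import torch.nn as nn

class SEModule(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)  # omega_1
        self.fc2 = nn.Linear(channels // reduction, channels)  # omega_2
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        n, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))                                # AvP(X): squeeze
        w = torch.sigmoid(self.fc2(self.relu(self.fc1(w))))   # excitation, eq. (2)
        w = w.view(n, c, 1, 1)
        return w * x + x                                      # eq. (3): recalibrate
```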
The SA module expands the re-weighting channel of the SE module into two parts, compression (Squeeze) and attention (Attention), as shown in FIG. 3b. The Squeeze part is consistent with the SE module, while the Attention part introduces two convolutional layers with 3 × 3 kernels to gather non-local features and then, according to the importance of each feature channel, promotes useful features and suppresses features of little use to the current task. The output of the feature map X after the two convolutional layers of the weighting channel is upsampled to the size of $X_{res}$ and denoted $X_{attn}$; $X_{attn}$ is multiplied channel by channel with $X_{res}$ and then added channel by channel to $X_{attn}$ to obtain the output $X_{out}$, defined as follows:

$$X_{attn} = \mathrm{Up}\big(\mathrm{Conv2}(\sigma(\mathrm{Conv1}(X)))\big) \qquad (4)$$

$$X_{out} = X_{attn} * X_{res} + X_{attn} \qquad (5)$$

wherein Conv1(·) and Conv2(·) represent the two convolutional layers of the weighting channel and Up(·) represents the upsampling function. The compression and attention module of the invention adopts the SA module so that feature maps at weak sea-land boundaries are given larger weights, enhancing the network's ability to extract weak boundary features.
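A minimal PyTorch sketch of the SA module in equations (4)-(5) follows; the spatial squeeze via 2 × 2 average pooling before the two 3 × 3 convolutions is an assumption (the equations only require that the attention branch be upsampled back to the size of the input).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SAModule(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AvgPool2d(2)                           # squeeze (assumed)
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x_res):
        a = self.pool(x_res)
        a = self.conv2(F.relu(self.conv1(a)))                 # Conv2(sigma(Conv1(.)))
        x_attn = F.interpolate(a, size=x_res.shape[2:],
                               mode='bilinear', align_corners=False)  # Up(.)
        return x_attn * x_res + x_attn                        # eq. (5)
```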
Corresponding to the 5 encoding layers, the decoder in this embodiment also employs 5 decoding modules, each consisting of a Res2_Block and an Upsample, which gradually restore the feature map to the original input image size; each Upsample layer doubles the feature map size. Each decoding layer is fused with the feature map output by the SA module of the corresponding level, so as to better exploit the detail information of shallow feature maps and the semantic information of deep feature maps, thereby generating more accurate sea-land segmentation results. In this embodiment, the feature map of each decoding layer is up-sampled to the size of the original image as a side output (Side-Output), and the side outputs are concatenated along the channel dimension (Concat) as the final output of the network, realizing multi-level feature fusion and improving the accuracy of sea-land segmentation. The structural details of the MSDSUNet decoding layers are shown in Table 2.
TABLE 2
[Table 2: structural details of the MSDSUNet decoding layers; reproduced as an image in the original publication and not recoverable here.]
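A minimal PyTorch sketch of one decoding level and the side-output fusion described above is given below, reusing the Res2Block sketch from the encoder section; the channel arithmetic and the 1-channel side outputs are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderLevel(nn.Module):
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        # out_ch must be divisible by the Res2Block scale (4 in the sketch above).
        self.reduce = nn.Conv2d(in_ch + skip_ch, out_ch, 1)
        self.block = Res2Block(out_ch)            # from the earlier sketch
        self.side = nn.Conv2d(out_ch, 1, 1)       # 1-channel side output

    def forward(self, deep, skip, out_size):
        # Upsample the deeper decoder feature to the SA skip feature's size.
        deep = F.interpolate(deep, size=skip.shape[2:], mode='bilinear',
                             align_corners=False)
        feat = self.block(self.reduce(torch.cat([deep, skip], dim=1)))
        # Side output: project to 1 channel and upsample to the original size.
        side = F.interpolate(self.side(feat), size=out_size,
                             mode='bilinear', align_corners=False)
        return feat, side

# Fusing M side outputs into the final prediction:
#   fused = nn.Conv2d(M, 1, 1)(torch.cat(side_outputs, dim=1))
```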
3. Training the constructed remote sensing image sea-land segmentation model with a deep supervision strategy.
The invention adopts a deep supervision strategy (DS) and computes a loss for each side output separately; that is, the output of each decoding module is trained individually, and each side-output loss contributes to the final loss function that supervises the training of the network. The total loss is defined as:

$$l = \sum_{m=1}^{M} W_{side}^{(m)} \, l_{side}^{(m)} + W_{fuse} \, l_{fuse} \qquad (6)$$

wherein $l_{side}^{(m)}$ denotes the loss of the m-th side output, $l_{fuse}$ denotes the loss of the final fused output, and $W_{side}^{(m)}$ and $W_{fuse}$ represent the weights of the corresponding losses. Each loss term uses the BCE loss function of the semantic segmentation binary classification task, defined as:

$$l_{BCE} = -\sum_{(r,c)}^{(H,W)} \Big[ P_{G(r,c)} \log P_{S(r,c)} + \big(1 - P_{G(r,c)}\big) \log\big(1 - P_{S(r,c)}\big) \Big] \qquad (7)$$

wherein (r, c) represents the coordinates of a pixel, H and W represent the height and width of the image, and $P_{G(r,c)}$ and $P_{S(r,c)}$ respectively represent the true value and the predicted value at the pixel. During training, the network weights are continuously learned so that the value of l tends to its minimum, achieving network convergence.
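A minimal PyTorch sketch of the deeply supervised objective in equations (6)-(7) follows; equal weights of 1 for all loss terms are an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def deep_supervision_loss(side_logits, fused_logits, target,
                          w_side=None, w_fuse=1.0):
    """side_logits: list of M side-output logit maps, each N x 1 x H x W.
    fused_logits: logit map of the fused output, same shape.
    target: float tensor of the ground-truth sea/land mask in {0., 1.}."""
    w_side = w_side or [1.0] * len(side_logits)
    # BCE on the fused output (l_fuse) ...
    loss = w_fuse * F.binary_cross_entropy_with_logits(fused_logits, target)
    # ... plus BCE on every side output (l_side^(m)).
    for w, logits in zip(w_side, side_logits):
        loss = loss + w * F.binary_cross_entropy_with_logits(logits, target)
    return loss
```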
The sea-land segmentation model is trained in this way with the training set formed in step 1, yielding the trained sea-land segmentation model.
4. Segmenting the remote sensing image to be segmented with the trained model.
Through the process, a trained sea-land segmentation model can be obtained, a remote sensing image to be segmented containing a sea-land boundary is obtained, the obtained remote sensing image is input into the sea-land segmentation model for segmentation, and a sea-land boundary segmentation result of the remote sensing image can be obtained.
To verify the effectiveness of the sea-land segmentation network adopted by the invention, two groups of public remote sensing sea-land segmentation data sets were selected for experiments: the remote sensing images of area A are recorded as Data set 1 (Data1) and those of area B as Data set 2 (Data2). Because remote sensing images cover wide areas and dense prediction tasks such as semantic segmentation demand substantial computing resources, the large-format images of both data sets were cut into tiles for training and prediction. The details of the two data sets are shown in Table 3.
TABLE 3
[Table 3: details of the two sea-land segmentation data sets; reproduced as images in the original publication and not recoverable here.]
Part of the training data and sample labels of the two data sets are shown in FIGS. 4a-4d: FIGS. 4a and 4b show training data and corresponding sample labels for two scenes in Data1, and FIGS. 4c and 4d for two scenes in Data2. Black pixels in the labels represent land areas and gray pixels represent sea areas; rivers and lakes on land are treated as the land category. In addition, edge detection is applied to the training labels to obtain a sea-land waterline with a width of 1 pixel, in preparation for evaluating the accuracy of edge detection in the next step.
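A minimal sketch of deriving such a 1-pixel-wide waterline from a binary label mask is shown below; morphological erosion via OpenCV is one straightforward edge detector, and its use here is an assumption rather than the patent's prescribed method.

```python
import cv2
import numpy as np

def waterline(mask):
    """mask: uint8 array with 1 = sea, 0 = land. Returns a 1-px boundary map."""
    kernel = np.ones((3, 3), np.uint8)
    eroded = cv2.erode(mask, kernel, iterations=1)
    return mask - eroded   # pixels removed by one erosion form the boundary
```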
The performance of the present invention and existing networks is evaluated on the sea-land segmentation data sets from two aspects: region segmentation and boundary detection. The F1 score (F1-score), Mean Intersection over Union (MIoU) and Mean Absolute Error (MAE) are used as evaluation indexes for region segmentation, and the boundary F1 score (F1-score-b) is used as the evaluation index for boundary detection.
(1) F1-score. F1-score is the harmonic mean of precision and recall; the higher the F1 value, the more accurate the network's sea-land segmentation. Here F1 refers to the average over the two categories, sea and land, and is expressed as:
$$P = \frac{TP}{TP + FP} \qquad (8)$$

$$R = \frac{TP}{TP + FN} \qquad (9)$$

$$F1 = \frac{1}{n} \sum_{i=1}^{n} \frac{2 P_i R_i}{P_i + R_i} \qquad (10)$$

In the formulas, TP (true positive) denotes a positive class judged as positive, FP (false positive) a negative class judged as positive, FN (false negative) a positive class judged as negative, and TN (true negative) a negative class judged as negative; P and R are the precision and recall, and n is the number of segmentation classes, with n = 2 in this test.
(2) MIoU. The IoU is the ratio of the intersection to the union of the target's actual and predicted regions, and MIoU is the average of the per-class IoU values:

$$\mathrm{MIoU} = \frac{1}{n} \sum_{i=1}^{n} \frac{TP_i}{TP_i + FP_i + FN_i} \qquad (11)$$
(3) MAE computes the mean absolute error between the prediction and the ground-truth label pixel by pixel, so it better reflects the actual prediction error; the smaller the MAE value, the closer the network's prediction is to the ground-truth label map:

$$\mathrm{MAE} = \frac{1}{H \times W} \sum_{i=1}^{H \times W} \left| P_i - y_i \right| \qquad (12)$$

wherein $P_i$ denotes the segmentation result output by the network at pixel i, and $y_i$ is the corresponding ground-truth label.
(4) F1-score-b measures the agreement between predicted boundary pixels and ground-truth boundary pixels within a tolerance of β pixels:

$$F1\text{-}score\text{-}b = \frac{2 P_\beta R_\beta}{P_\beta + R_\beta} \qquad (13)$$

wherein $P_\beta$ and $R_\beta$ respectively denote the precision and recall of boundary pixels within β pixels; β is set to 3 in this experiment. F1-score-b serves as an evaluation standard for boundary segmentation quality; edges are extracted from the segmentation results and the ground-truth labels to obtain segmentation boundaries with a width of 1 pixel.
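The following is a minimal NumPy/OpenCV sketch of the four measures in equations (8)-(13); implementing the β-pixel tolerance of F1-score-b via dilation is an assumption consistent with the definition above, not necessarily the authors' exact implementation.

```python
import cv2
import numpy as np

def f1_miou_mae(pred, gt):
    """pred, gt: binary uint8 masks (1 = sea, 0 = land)."""
    f1s, ious = [], []
    for cls in (0, 1):                        # average over the two classes
        tp = np.sum((pred == cls) & (gt == cls))
        fp = np.sum((pred == cls) & (gt != cls))
        fn = np.sum((pred != cls) & (gt == cls))
        p, r = tp / (tp + fp + 1e-9), tp / (tp + fn + 1e-9)
        f1s.append(2 * p * r / (p + r + 1e-9))
        ious.append(tp / (tp + fp + fn + 1e-9))
    mae = np.mean(np.abs(pred.astype(float) - gt.astype(float)))
    return np.mean(f1s), np.mean(ious), mae

def boundary_f1(pred_edge, gt_edge, beta=3):
    """pred_edge, gt_edge: 1-px boundary maps (e.g., from the waterline() sketch)."""
    kernel = np.ones((2 * beta + 1, 2 * beta + 1), np.uint8)
    p = np.sum(pred_edge & cv2.dilate(gt_edge, kernel)) / (np.sum(pred_edge) + 1e-9)
    r = np.sum(gt_edge & cv2.dilate(pred_edge, kernel)) / (np.sum(gt_edge) + 1e-9)
    return 2 * p * r / (p + r + 1e-9)
```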
To verify the effectiveness of the adopted segmentation network, it is compared with U-Net, Deeplabv3+, U2-Net and RAUNet. U-Net and Deeplabv3+ are classic semantic segmentation methods; RAUNet adopts an encoding-decoding structure and designs an attention augmentation module (AAM) to fuse multi-level features and capture global context information; U2-Net uses a two-level nested U-structure with the designed ReSidual U-blocks (RSU) so that the network captures richer feature information from both shallow and deep layers. In the experiments, the comparison models such as U-Net are implemented from their published source code, and all models are retrained on the two sea-land segmentation data sets for comparison.
The experiments use the PyTorch machine learning framework under Windows, with an Intel(R) Xeon(R) E-2176G CPU and a GTX 2080 Ti GPU with 11 GB of video memory. Every network is trained in the same environment with identical training parameters: the Adam optimizer is selected, the batch size is set to 4, the initial learning rate is set to 0.0001, and training runs for a total of 50 epochs.
The prediction results of the proposed segmentation network and the comparison networks on the two data sets are compared below. To compare the sea-land segmentation capability of the methods more comprehensively, two typical scenes are selected from Data set 1 (Data1), denoted scene 1 and scene 2, and two representative scenes from Data set 2 (Data2), denoted scene 3 and scene 4. Each scene image, its label and each network's prediction are shown in FIG. 5, with scene 1, scene 2, scene 3 and scene 4 arranged from left to right. In addition, the sea-land boundaries produced by the networks are overlaid on the original images to show more intuitively their effectiveness in the sea-land segmentation task.
Scene 1 explores each network's extraction capability at weak sea-land boundaries such as silt and estuaries. As shown in the leftmost column of FIG. 5, at the silt-laden sea-land boundary in the upper white frame, U-Net, Deeplabv3+, U2-Net and RAUNet cannot classify sea and land correctly, whereas the proposed segmentation network overcomes the silt interference and obtains an accurate result at the weak sea-land boundary. The lower white frame contains a large estuary, and the part between the estuary and the first bridge is conventionally defined as sea area: U-Net, Deeplabv3+ and U2-Net misclassify this entire area as land, RAUNet classifies only part of the sea area correctly, and the proposed network's result at the river mouth is more consistent with the convention. Scene 2 includes slender hydraulic structures such as harbors and breakwaters and tests the networks' sea-land segmentation of artificial coasts with complex boundaries. As shown in the second column of FIG. 5, the white frames show that U-Net and RAUNet extract parts of the breakwater area but their segmentation is still not accurate enough, while U2-Net and Deeplabv3+ misclassify the breakwater entirely as sea area; this indicates that fusing high-level and low-level semantic information can improve segmentation performance to some extent but is not ideal for boundaries and details. The proposed network uses a deep supervision strategy during training, effectively retains the edge details of the breakwater, and extracts relatively complete and continuous sea-land boundaries in artificial coast areas with complex boundaries, yielding segmentation results clearly superior to the other networks. In particular, the left white frame of the second column of FIG. 5 is surrounded by the harbor, so networks such as U-Net and Deeplabv3+ ignore the sea-area characteristics of this region and misclassify it as land, whereas the proposed segmentation network distinguishes the harbor and sea categories and obtains a more accurate sea-land segmentation result.
Scene 3 is a rocky coast area and explores each network's sea-land segmentation under a complex land background. As shown in the white-circle region of the third column of FIG. 5, the land in this area contains interference factors such as rocks and vegetation, making the segmentation background complex; the comparison networks such as U-Net cannot extract deeper semantic information, so their segmentation results show adhesion. The proposed segmentation network introduces an attention mechanism to strengthen the features of the sea-land boundary and obtains a complete and accurate segmentation result in a coastal region with a complex background. Scene 4 is a coastal image containing multiple islands and tests segmentation under complex waterline shapes. As shown in the rightmost column of FIG. 5, the sea-land boundaries of islands are usually obvious, and every network segments this type of image noticeably better than the other scenes; however, the irregular shapes and sizes of the islands easily cause sharp indentations of the waterline (white-circle region), where the comparison networks struggle with the complex boundary information and misclassify. The proposed network uses Res2Net to extract multi-scale image information, compensating for the ambiguity caused by insufficient local information, overcomes the interference of the waterline indentations, obtains more accurate sea-land segmentation results, and extracts clearer and more continuous sea-land boundaries.
The evaluation results of each network on the two data sets are shown in Table 4. On data set 1, U-Net and U2-Net obtain lower region segmentation precision in F1-score and the other region indexes, although U2-Net's deep supervision strategy makes its F1-score-b value higher than U-Net's. Deeplabv3+ achieves higher region segmentation precision than U-Net and U2-Net, but its F1-score-b is lower because it does not fully utilize the detail information of shallow feature maps. RAUNet achieves the suboptimal values among the compared networks on data set 1, with F1-score, MIoU and MAE of 98.38%, 96.82% and 0.016 respectively, and an F1-score-b of 69.25%. The proposed segmentation network outperforms the other networks on every index of data set 1: F1-score and MIoU improve by 0.71% and 1.37% over the suboptimal values obtained by RAUNet, MAE falls to 0.009, and the network's F1-score-b reaches 80.14%, an improvement of 10.89% over the suboptimal value.
On data set 2, U-Net and U2-Net again achieve lower region segmentation accuracy than the other networks. Deeplabv3+ achieves the suboptimal F1-score, MIoU and F1-score-b on data set 2, whereas RAUNet's edge segmentation on this data set is poor, with an F1-score-b of only 49.77%, far below the other networks. The proposed segmentation network attains the best value of every index on data set 2: F1-score and MIoU improve by 0.42% and 0.83% over the suboptimal values, MAE is reduced by 0.187, and the edge F value reaches 72.86%, an improvement of 17.81% over the suboptimal value.
TABLE 4
[Table 4: evaluation results of each network on the two data sets; reproduced as an image in the original publication and not recoverable here.]
To further study the effect of each module in the proposed segmentation model, the model is split and ablation experiments verify the effects of the Res2Net module, the deep supervision strategy (DS) and the SA module respectively. Table 5 gives the F1-score-b values of the ablation experiments on data set 1, where the basic network is an encoding-decoding (En_Decoder) structure and, for the ablation of Res2Net, its blocks are replaced with ResNet modules.
TABLE 5
[Table 5: F1-score-b values of the ablation experiments on data set 1; reproduced as images in the original publication and not recoverable here.]
The results show that when the proposed segmentation network uses the Res2Net module, the boundary segmentation precision improves by 10.45% over the ResNet module; after the deep supervision strategy is adopted, boundary segmentation precision improves by a further 1.03%; and the network with the SA module added (MSDSUNet) reaches a boundary segmentation precision of 80.14%, a 5.03% improvement and the best value among the configurations, demonstrating the necessity of each module for the sea-land segmentation task.
From the above analysis, it can be seen that: the attention module can improve the characteristic response of the weak sea-land boundary and has advantages in the weak sea-land boundary extraction; the deep supervision strategy can enhance the capability of learning the target boundary information by the network and improve the accuracy of edge segmentation. Therefore, the network of the invention can be suitable for the sea and land segmentation of different types of remote sensing images, and can obtain the optimal result in both the region and the edge detection result.

Claims (8)

1. A remote sensing image land and sea segmentation method is characterized by comprising the following steps:
1) obtaining remote sensing images and labelling the remote sensing image data set to form corresponding training set data;
2) establishing a remote sensing image sea-land segmentation model, wherein the model adopts an encoder-decoder structure; the encoder consists of multiple layers of encoding modules, each extracting features of the remote sensing image at a different scale; the decoder comprises decoding modules corresponding to the encoding modules, each fusing the output features of the corresponding encoding module with the up-sampled output of the preceding decoding module; the features processed by each decoding module are up-sampled to the size of the original image and fused, and edge detection is performed on the fused feature map;
3) training the established remote sensing image sea-land segmentation model with the training set data, training each layer of decoding module separately by adopting a deep supervision strategy, and constructing the total loss function of the model from the loss functions of each layer of decoding module;
4) acquiring the remote sensing image to be segmented and inputting it into the trained remote sensing image sea-land segmentation model to realize sea-land segmentation of the image.
2. The remote-sensing image sea-land segmentation method according to claim 1, wherein the multilayer coding modules adopt a Res2Net network and comprise 5 layers of coding modules, the first layer of coding module comprises a convolution layer and a maximum pooling layer and is used for extracting features of the input remote-sensing image, and the other layers of coding modules all adopt residual blocks and are used for processing output results of the previous layer of coding module.
3. A remote sensing image land and sea segmentation method as claimed in claim 2, characterized in that the residual block is Res2_Block.
4. The remote-sensing image sea-land segmentation method according to claim 2, wherein the remote-sensing image sea-land segmentation model further comprises a compression and attention module, and an output result of each layer of coding module is processed by the corresponding compression and attention module.
5. A remote sensing image sea-land segmentation method as claimed in claim 2, characterized in that each layer of decoding module comprises a Res2_Block and an Upsample for gradually restoring the feature map to the original input image size.
6. The method for sea-land segmentation of remote-sensing images according to any one of claims 1 to 5, wherein the loss function of the model for sea-land segmentation of remote-sensing images during training is as follows:
$$l = \sum_{m=1}^{M} W_{side}^{(m)} \, l_{side}^{(m)} + W_{fuse} \, l_{fuse}$$

wherein M is equal to the number of layers of the decoding module and is also the number of layers of the encoding module; $l_{side}^{(m)}$ represents the loss function of the m-th layer decoding module; $l_{fuse}$ represents the loss function after the fusion of the decoding modules of each layer; $W_{side}^{(m)}$ and $W_{fuse}$ respectively represent the weight of the loss function of the m-th layer decoding module and the weight of the fused loss function.
7. The remote sensing image sea-land segmentation method according to claim 6, wherein the loss function of each layer of decoding module and the loss function after fusion both adopt BCE loss functions of semantic segmentation binary classification tasks.
8. The remote-sensing image land and sea segmentation method according to claim 7, wherein the BCE loss function is:
$$l_{BCE} = -\sum_{(r,c)}^{(H,W)} \Big[ P_{G(r,c)} \log P_{S(r,c)} + \big(1 - P_{G(r,c)}\big) \log\big(1 - P_{S(r,c)}\big) \Big]$$

wherein (r, c) represents the coordinates of a pixel, H and W represent the height and width of the image, and $P_{G(r,c)}$ and $P_{S(r,c)}$ respectively represent the true value and the predicted value of the pixel.
CN202210280187.XA 2022-03-21 2022-03-21 Remote sensing image land and sea segmentation method Pending CN114663439A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210280187.XA CN114663439A (en) 2022-03-21 2022-03-21 Remote sensing image land and sea segmentation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210280187.XA CN114663439A (en) 2022-03-21 2022-03-21 Remote sensing image land and sea segmentation method

Publications (1)

Publication Number Publication Date
CN114663439A true CN114663439A (en) 2022-06-24

Family

ID=82031286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210280187.XA Pending CN114663439A (en) 2022-03-21 2022-03-21 Remote sensing image land and sea segmentation method

Country Status (1)

Country Link
CN (1) CN114663439A (en)


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116342616A (en) * 2023-03-15 2023-06-27 大连海事大学 Remote sensing image sea-land segmentation method based on double-branch integrated learning
CN116342616B (en) * 2023-03-15 2023-10-27 大连海事大学 Remote sensing image sea-land segmentation method based on double-branch integrated learning
CN116594061A (en) * 2023-07-18 2023-08-15 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN116594061B (en) * 2023-07-18 2023-09-22 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN117312471A (en) * 2023-09-26 2023-12-29 Unit 91977 of the Chinese People's Liberation Army Sea-land attribute judging method and device for massive position points
CN117312471B (en) * 2023-09-26 2024-05-28 Unit 91977 of the Chinese People's Liberation Army Sea-land attribute judging method and device for massive position points
CN117635628A (en) * 2024-01-23 2024-03-01 武汉理工大学三亚科教创新园 Sea-land segmentation method based on context attention and boundary perception guidance
CN117635628B (en) * 2024-01-23 2024-04-09 武汉理工大学三亚科教创新园 Sea-land segmentation method based on context attention and boundary perception guidance

Similar Documents

Publication Publication Date Title
CN114663439A (en) Remote sensing image land and sea segmentation method
CN114119582B (en) Synthetic aperture radar image target detection method
CN109934200B (en) RGB color remote sensing image cloud detection method and system based on improved M-Net
CN112183258A (en) Remote sensing image road segmentation method based on context information and attention mechanism
CN112766087A (en) Optical remote sensing image ship detection method based on knowledge distillation
CN111753677B (en) Multi-angle remote sensing ship image target detection method based on characteristic pyramid structure
CN112489054A (en) Remote sensing image semantic segmentation method based on deep learning
CN116721112B (en) Underwater camouflage object image segmentation method based on double-branch decoder network
CN112560865B (en) Semantic segmentation method for point cloud under outdoor large scene
CN113255837A (en) Improved CenterNet network-based target detection method in industrial environment
CN116343045B (en) Lightweight SAR image ship target detection method based on YOLO v5
CN113052180A (en) Encoding and decoding network port image segmentation method fusing semantic flow fields
CN115512103A (en) Multi-scale fusion remote sensing image semantic segmentation method and system
CN114973011A (en) High-resolution remote sensing image building extraction method based on deep learning
CN113392711A (en) Smoke semantic segmentation method and system based on high-level semantics and noise suppression
CN112037225A (en) Marine ship image segmentation method based on convolutional neural networks
CN114821069A (en) Building semantic segmentation method for double-branch network remote sensing image fused with rich scale features
Liu et al. Two-stage underwater object detection network using swin transformer
CN114757864A (en) Multi-level fine-grained image generation method based on multi-scale feature decoupling
CN114067162A (en) Image reconstruction method and system based on multi-scale and multi-granularity feature decoupling
CN115457568A (en) Historical document image noise reduction method and system based on generation countermeasure network
CN116935332A (en) Fishing boat target detection and tracking method based on dynamic video
CN112330562A (en) Heterogeneous remote sensing image transformation method and system
CN114937154B (en) Significance detection method based on recursive decoder
CN116503755A (en) Automatic recognition analysis method for shoreline remote sensing based on cloud platform and deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination