CN115862010B - High-resolution remote sensing image water body extraction method based on semantic segmentation model

Info

Publication number: CN115862010B
Application number: CN202211102819.XA
Authority: CN
Inventors: 陈冬花; 李虎; 邹陈; 汪左; 刘赛赛; 谢以梅
Original assignee: Anhui Normal University; Chuzhou University
Current assignee: Anhui Normal University; Chuzhou University
Priority date: 2022-09-09
Filing date: 2022-09-09
Publication date: 2023-09-05
Anticipated expiration: 2042-09-09
Also published as: CN115862010A

Abstract

The invention discloses a high-resolution remote sensing image water body extraction method based on a semantic segmentation model, which relates to the technical field of remote sensing images and comprises the following steps: acquiring remote sensing image data; extracting the water body spectral features of the remote sensing image data; combining the water body spectral features with HH and HV dual-polarization features to construct a water body sample data set; constructing an improved DeepLabV3+ network model; performing semantic segmentation training on the improved DeepLabV3+ network model with the water body sample data set, and adjusting parameters of the improved model to obtain an optimal DeepLabV3+ network model; and extracting water body information with the optimal DeepLabV3+ network model. The invention combines GF-3 SAR images with GF-6 optical remote sensing images for the first time to produce a water body sample data set, and improves and optimizes the DeepLabV3+ network by substituting an improved backbone network, selecting the loss function type and setting different learning rates, thereby providing strong technical support for research on water environment problems against complex backgrounds.

Description

High-resolution remote sensing image water body extraction method based on semantic segmentation model

Technical Field

The invention relates to the technical field of remote sensing images, in particular to a high-resolution remote sensing image water body extraction method based on a semantic segmentation model.

Background

Remote sensing extraction of water body information plays a very important role in urban planning, disaster prevention and control, land development, crop growth and other aspects of human production and life, in particular in water resource investigation, flood disaster assessment, water environment protection and analysis of dynamic water body changes.

Current remote sensing extraction of water bodies is limited by the spatial resolution of remote sensing images, and interference from topography and human factors gives rise to the phenomena of 'same object, different spectra' and 'different objects, same spectrum'. As a result, water body extraction suffers from a low degree of automation and poor accuracy, water body information is difficult to extract accurately and in a timely manner, and observing water bodies over long time series is a great challenge. With the networking of China's high-resolution remote sensing satellites, the spatial, spectral and temporal resolution of remote sensing images has improved greatly, so applying high-resolution remote sensing to water body information extraction is particularly important and provides a feasible technical means for extracting water body information accurately and efficiently.

Water body remote sensing extraction algorithms have developed through classical threshold segmentation, water index methods, object-oriented methods and machine learning algorithms. As research on water body extraction has deepened, the classical methods have struggled to meet the requirements for accuracy and efficiency, and a new technique is needed to meet the demand for rapid, high-accuracy production.

With the progress of high-resolution remote sensing technology in China, the variety and quantity of remote sensing data have become increasingly abundant and image quality and resolution have improved markedly, so how to process and analyze information effectively against a big data background is key to applying remote sensing technology. Remote sensing image semantic segmentation based on convolutional neural networks can extract water body information effectively and achieves excellent efficiency and accuracy. However, semantic segmentation of remote sensing images still faces great challenges: detail information in the image is easily lost during downsampling; deeper network structures and larger numbers of model parameters bring a very heavy computational load and place higher demands on computer hardware; and few existing public data sets meet the requirements. Therefore, how to select a suitable network structure, reduce the amount of computation, improve computational performance and find a network model with high accuracy and good generalization is a problem that urgently needs to be solved in current deep learning semantic segmentation.

Disclosure of Invention

The embodiment of the invention provides a high-resolution remote sensing image water body extraction method based on a semantic segmentation model, which can solve the problems in the prior art.

The invention provides a high-resolution remote sensing image water body extraction method based on a semantic segmentation model, which comprises the following steps of:

acquiring remote sensing image data;

extracting the water spectrum characteristics of the remote sensing image data;

combining the water spectrum characteristics with HH and HV dual polarization mode characteristics to construct a water sample data set;

replacing the Xception backbone network in the DeepLabV3+ network model with an improved Aligned Xception backbone network to construct an improved DeepLabV3+ network model, wherein the Aligned Xception backbone network comprises the following improvements:

increasing the number of network layers to deepen the network, so as to extract rich semantic information from the image;

replacing every max pooling layer with stride=2 in the Entry flow of the original Xception network with a depthwise separable convolution with stride=2;

adding batch normalization (BN) and a ReLU activation function after each 3×3 depthwise separable convolution, and repeating the Middle flow block 16 times instead of the original 8 times;

performing semantic segmentation training on the improved DeepLabV3+ network model with the water body sample data set, and adjusting parameters of the improved DeepLabV3+ network model to obtain an optimal DeepLabV3+ network model;

and extracting water body information with the optimal DeepLabV3+ network model.

Preferably, adjusting the parameters of the improved DeepLabV3+ network model includes setting different loss functions and learning rates.

Preferably, the remote sensing image data is Gaofen-6 (GF-6) satellite remote sensing image data, and the GF-6 remote sensing image data is preprocessed before the water body spectral features are extracted; the GF-6 preprocessing comprises radiometric calibration, atmospheric correction, orthorectification and image fusion.

Preferably, the preprocessed GF-6 remote sensing image data is resampled so that the spatial resolutions of GF-6 and GF-3 are the same.

Preferably, the dual-polarization modes are the HH and HV dual-polarization modes of the Gaofen-3 (GF-3) satellite.

Preferably, the water body spectral feature is the near-infrared (NIR) band.

Preferably, combining the water body spectral features with the HH and HV dual-polarization features to construct the water body sample data set specifically comprises the following steps:

analyzing the morphological features of water bodies on the combination of the water body spectral features and the HH and HV dual-polarization features to construct an image data set;

extracting coarse water body information from the image data set;

refining the coarse water body information;

obtaining a polygon layer of water body and non-water body information from the refined coarse water body information;

overlaying the polygon layer on the image data set to generate a raster label image;

cropping the raster label image in batches to obtain sample label subsets, and cropping the image data set in the same way to obtain a water body sample image set;

screening the cropped sample label subsets to produce a water body sample label set;

and augmenting the water body sample image set and the water body sample label set to finally obtain the water body sample data set.

Preferably, the water body sample data set is divided into a training set, a validation set and a test set at a ratio of 6:2:2.

Compared with the prior art, the invention has the beneficial effects that:

The invention combines GF-3 SAR images with GF-6 optical remote sensing images for the first time to produce a water body sample data set. By substituting an improved backbone network, selecting the loss function type and setting different learning rates, it finds the best network performance among different hyperparameter combinations, thereby improving and optimizing the DeepLabV3+ network and providing strong technical and data support for research on water environment problems against complex backgrounds. Compared from multiple angles (water body type, accuracy of results, efficiency and so on) with the classical remote sensing water body extraction methods, namely the threshold segmentation method, the NDWI water index method and the SVM classification method, the method can accurately identify large water bodies (such as the Yangtze River basin), medium water bodies (rivers, lakes and the like) and small water bodies (farmland ditches and the like) in the study area, and outperforms the classical water body extraction methods in every respect.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is an overall flow diagram of the present invention;

FIG. 2 is a DeepLabV3+ semantic segmentation network diagram of the present invention;

FIG. 3 is a diagram of the improved Aligned Xception network of the present invention;

FIG. 4 (a) shows the water body extraction result of the method of the present invention for Wuhu City;

FIG. 4 (b) shows the water body extraction result of the threshold segmentation method for Wuhu City;

FIG. 4 (c) shows the water body extraction result of the NDWI water index method for Wuhu City;

FIG. 4 (d) shows the water body extraction result of the SVM classification method for Wuhu City;

FIG. 5 (a) shows the water body extraction result of the method of the present invention for the Yangtze River basin;

FIG. 5 (b) shows the water body extraction result of the threshold segmentation method for the Yangtze River basin;

FIG. 5 (c) shows the water body extraction result of the NDWI water index method for the Yangtze River basin;

FIG. 5 (d) shows the water body extraction result of the SVM classification method for the Yangtze River basin;

FIG. 6 (a) shows the extraction result of the method of the present invention for medium-sized water bodies such as rivers and lakes;

FIG. 6 (b) shows the extraction result of the threshold segmentation method for medium-sized water bodies such as rivers and lakes;

FIG. 6 (c) shows the extraction result of the NDWI water index method for medium-sized water bodies such as rivers and lakes;

FIG. 6 (d) shows the extraction result of the SVM classification method for medium-sized water bodies such as rivers and lakes;

FIG. 7 (a) shows the extraction result of the method of the present invention for small water bodies such as farmland ditches;

FIG. 7 (b) shows the extraction result of the threshold segmentation method for small water bodies such as farmland ditches;

FIG. 7 (c) shows the extraction result of the NDWI water index method for small water bodies such as farmland ditches;

FIG. 7 (d) shows the extraction result of the SVM classification method for small water bodies such as farmland ditches.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1-7, the invention provides a high-resolution remote sensing image water body extraction method based on a semantic segmentation model, which comprises the following steps:

First step: acquiring GF-3 remote sensing image data, GF-6 remote sensing image data and DEM data.

The invention obtains two scenes of GF-3 Fine Stripmap II (FSII) imaging mode data and five scenes of GF-6 PMS sensor data from the China Centre for Resources Satellite Data and Application. The GF-3 FSII data use the HH and HV dual-polarization modes, with a spatial resolution of up to 10 m and a swath width of up to 100 km. After the panchromatic and multispectral data of GF-6 PMS are fused, the spatial resolution reaches 2 m and the swath width reaches 90 km. The GF-3 FSII data were acquired on 24 July 2020. The GF-6 PMS images were acquired around 4 September 2020; the selected images are of good quality, with cloud cover generally below 5%. The product number, center longitude and latitude, and imaging time of the acquired remote sensing images are shown in Table 1.

TABLE 1 remote sensing image data basic information

The invention obtains digital elevation model (DEM) data from the Geospatial Data Cloud; each tile covers 1°×1° and the data resolution is 30 m.

The GF-6 remote sensing image data is then preprocessed. The GF-6 image preprocessing flow comprises radiometric calibration, atmospheric correction, orthorectification and image fusion.

Radiometric calibration is the process of converting the pixel brightness values of an image into the apparent reflectance at the top of the atmosphere. With the ENVI plug-in for Chinese domestic satellites, GF-6 data can be opened and the calibration parameters in the image metadata file are read automatically, removing the errors introduced into the remote sensing image by the sensor.

Atmospheric correction is the process of converting the apparent reflectance at the top of the atmosphere into the true surface reflectance by removing the radiometric errors caused by atmospheric absorption, scattering, reflection and the like that are contained in the total radiance received by the sensor. The invention uses the FLAASH atmospheric correction module of ENVI 5.3. Parameters available in the image metadata file are read automatically; the sensor type is UNKNOWN-MSI, the sensor altitude is set to 645 km, the average elevation of the imaging area is the statistical mean of the clipped DEM data, the pixel size is 8 m, and the atmospheric model is selected according to the acquisition time of the image and the longitude and latitude of the study area.

Orthorectification is the process of generating a planar orthoimage by removing the geometric distortion caused by terrain, the sensor and the like. For the atmospherically corrected GF-6 panchromatic and multispectral images, the invention uses 30 m ASTER GDEM data and selects more than 13 control points from offset-free Tianditu imagery with a spatial resolution of 0.5 m in the 91 Weitu software to orthorectify each image, keeping the correction error within 1 pixel.

Image fusion means fusing two or more images that cover the same target and have redundant or complementary characteristics according to a specific rule to generate a new image. The fused image combines the information provided by each single image, improving the spatial and spectral resolution and describing the target more clearly, completely and accurately. The invention uses the 'NNDiffuse Pan Sharpening' image fusion method in ENVI 5.3 to fuse the 2 m high-spatial-resolution panchromatic band of the GF-6 PMS sensor with its 8 m multispectral data, obtaining a four-band multispectral image with a spatial resolution of 2 m. The fused image not only has improved spatial resolution but also retains the texture and spectral information of the multispectral data well, making the most of both inputs.

Since the spatial resolution of the GF-3 FSII imaging mode is 10 m and that of the fused GF-6 PMS data is 2 m, the GF-6 image data must be resampled from 2 m to 10 m. The invention resamples the 2 m GF-6 PMS remote sensing image data to 10 m with the 'Resize Data' tool in ENVI 5.3, using cubic convolution interpolation.

Second step: extracting the water body spectral features of the preprocessed GF-6 remote sensing image data.

Mathematical algebra methods generally extract spectral feature information from an image by numerical operations. Common algorithms include the gray value statistics method, the water index method and the threshold method; see Table 2, where B is the reflectance of the blue band, G is the reflectance of the green band, R is the reflectance of the red band, NIR is the reflectance of the near-infrared band, MIR is the reflectance of the mid-infrared band, SIR is the reflectance of the short-wave infrared band, and N is a threshold.

TABLE 2 Summary of common water indices

Based on the GF-6 PMS remote sensing image data and the characteristics of the spectral curve of water, the near-infrared (NIR) band of the image is extracted as the water body spectral feature.
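The NDWI water index method that serves later as a comparison baseline can be sketched as follows. This is a minimal illustration, assuming the widely used definition NDWI = (G - NIR) / (G + NIR) and a threshold N of 0; the function names, the small `eps` guard and the toy reflectance values are hypothetical and are not taken from the patent.

```python
import numpy as np


def ndwi(green: np.ndarray, nir: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Normalized Difference Water Index: (G - NIR) / (G + NIR)."""
    green = green.astype(np.float64)
    nir = nir.astype(np.float64)
    # eps only guards against division by zero on empty pixels.
    return (green - nir) / (green + nir + eps)


def water_mask(index: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    """Label pixels whose index exceeds the threshold N as water (1), else 0."""
    return (index > threshold).astype(np.uint8)


# Toy reflectances (hypothetical): column 0 behaves like water (low NIR),
# column 1 like vegetation (high NIR).
green = np.array([[0.08, 0.10], [0.09, 0.12]])
nir = np.array([[0.02, 0.40], [0.03, 0.45]])
mask = water_mask(ndwi(green, nir))
```

Water absorbs strongly in the near-infrared, so low NIR reflectance relative to green pushes the index above zero; this is also why the NIR band alone is a useful spectral feature.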

Third step: combining the water body spectral feature (SF) with the two polarization modes of GF-3 to synthesize a 3-band image, obtaining the feature combination SF+HH+HV, and constructing the water body sample data set.
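A minimal sketch of synthesizing the 3-band SF+HH+HV image is given below. It assumes the NIR, HH and HV rasters are already co-registered on the same 10 m grid and held as NumPy arrays; the per-band min-max normalization and the function names are assumptions made for illustration, since the patent does not state how the bands are scaled.

```python
import numpy as np


def normalize(band: np.ndarray) -> np.ndarray:
    """Linearly rescale one band to [0, 1]; a constant band maps to zeros."""
    band = band.astype(np.float64)
    lo, hi = band.min(), band.max()
    if hi == lo:
        return np.zeros_like(band)
    return (band - lo) / (hi - lo)


def stack_features(nir: np.ndarray, hh: np.ndarray, hv: np.ndarray) -> np.ndarray:
    """Stack SF (NIR), HH and HV into one (rows, cols, 3) image."""
    if not (nir.shape == hh.shape == hv.shape):
        raise ValueError("bands must be co-registered on the same grid")
    return np.stack([normalize(nir), normalize(hh), normalize(hv)], axis=-1)


# Random stand-ins for the three rasters (hypothetical data).
rng = np.random.default_rng(0)
nir = rng.random((4, 4))
hh = rng.random((4, 4)) * 255
hv = rng.random((4, 4)) * 255
combined = stack_features(nir, hh, hv)
```

Scaling each band separately keeps the optical reflectance and the SAR backscatter on comparable ranges before they are fed to the network.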

For the combination scheme, the spatial geometric features of water bodies are analyzed according to their different forms, and an image set of the feature combination is constructed. During training of the image semantic segmentation network, the ground features in each image must correspond to the label data. Using ArcGIS, ENVI 5.3, 91 Weitu and Python tools, water and non-water information is labeled manually on SF+HH+HV to produce the sample label data. The specific steps are as follows:

(1) Coarse water body extraction. Because the image sets of the three feature combinations share the same geographic coordinates and the positions of all ground features in the images are consistent, the invention uses ArcGIS with the GF-3 HV polarization image as a base map to vectorize the approximate extent of the water bodies, obtaining a coarse water body extent.

(2) Refinement of the coarse water body information. The vectorized coarse water body extent is imported into the 91 Weitu software for refinement. Historical offset-free Tianditu imagery is used, selected from a time consistent with or close to the image acquisition period. Since the resolution of this imagery reaches 0.5 m, the coarse water bodies can be refined by visual interpretation.

(3) Assignment of water body and non-water body attributes. The water body information refined in 91 Weitu is imported into ArcGIS, and the attribute field of the water body class is set to 1. The image boundary is extracted with ArcGIS tools, all water polygons are merged into one polygon, and the water polygon is cut out of the boundary polygon to obtain the non-water area, whose attribute field is set to 0. The ArcGIS Update tool is then used to update the non-water layer with the water body information, giving a polygon layer containing both water and non-water information.

(4) Verification and raster generation. The vector polygon layer obtained in the previous step is imported into ENVI and overlaid on the images of the 3 different feature combinations to verify the water labels. If the labels are correct, the vector polygons are converted into a TIF raster image with the polygon-to-raster conversion tool in ArcGIS. In the raster image, an attribute value of 1 represents water and 0 represents non-water.

(5) Batch cropping. Because a convolutional neural network generally requires square input images and the spatial resolution of the image data set is 10 m, the raster label image is cropped in batches into 128×128 tiles using Python. The remote sensing image data sets of the different feature combinations are cropped in the same way and stored as the water body sample image sets for experiment one, experiment two and experiment three.

(6) Screening. The sample label subsets obtained in step (5) are screened so that every retained label sample contains water body information, that is, pixels with an attribute value of 1, and so that water bodies of various forms are represented, which completes the water body sample label set. Cropping with Python yields 6119 image samples and label samples of 128×128 size.
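Steps (5) and (6) can be sketched as follows. This is an illustration under stated assumptions rather than the script used in the patent: tiles are taken on a non-overlapping grid, ragged edges are dropped, and a tile is kept if its label contains at least one water pixel. The function names and the toy arrays are hypothetical.

```python
import numpy as np

TILE = 128  # tile size in pixels, as stated in step (5)


def tile_pairs(image: np.ndarray, label: np.ndarray, tile: int = TILE):
    """Yield (image_tile, label_tile) pairs cut on the same grid."""
    rows, cols = label.shape
    # Assumption: non-overlapping tiles, incomplete edge tiles are discarded.
    for r in range(0, rows - tile + 1, tile):
        for c in range(0, cols - tile + 1, tile):
            yield image[r:r + tile, c:c + tile], label[r:r + tile, c:c + tile]


def keep_water_tiles(pairs):
    """Keep only pairs whose label tile contains water (attribute value 1)."""
    return [(img, lab) for img, lab in pairs if (lab == 1).any()]


# Toy 3-band image and label raster (hypothetical data): 2 x 3 = 6 tiles.
image = np.zeros((256, 384, 3), dtype=np.float32)
label = np.zeros((256, 384), dtype=np.uint8)
label[10:40, 10:40] = 1        # water in the top-left tile
label[200:210, 300:310] = 1    # water in the bottom-right tile
samples = keep_water_tiles(tile_pairs(image, label))
```

Cutting the image and the label with the same row and column offsets is what keeps each image tile aligned with its label tile.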

The invention expands the water body sample data set by data augmentation, mainly rotation and horizontal and vertical flipping, which adds noisy data to the samples and improves the robustness of the model. The data set is divided into a training set, a validation set and a test set at a ratio of 6:2:2. The network model first learns from the training set to fit the data, the parameters of the network are then adjusted on the validation set to control the complexity of the network structure, and finally the test set is used to test the performance of the network model and evaluate the generalization ability of the final model.
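The augmentation and the 6:2:2 split can be sketched as below. The choice of three rotations (90, 180 and 270 degrees), the fixed random seed and the function names are assumptions for illustration; the patent names only rotation and horizontal and vertical flipping. The count of 6119 is used purely as an example of sample identifiers.

```python
import random

import numpy as np


def augment(image: np.ndarray, label: np.ndarray):
    """Return the original pair plus three rotations and two flips (6 pairs)."""
    pairs = [(image, label)]
    for k in (1, 2, 3):  # 90, 180 and 270 degrees
        pairs.append((np.rot90(image, k), np.rot90(label, k)))
    pairs.append((np.fliplr(image), np.fliplr(label)))  # horizontal flip
    pairs.append((np.flipud(image), np.flipud(label)))  # vertical flip
    return pairs


def split_622(items, seed: int = 0):
    """Shuffle and split into training, validation and test parts (6:2:2)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * 0.6)
    n_val = int(len(items) * 0.2)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


tile = np.zeros((128, 128, 3), dtype=np.float32)
lab = np.zeros((128, 128), dtype=np.uint8)
augmented = augment(tile, lab)
train_ids, val_ids, test_ids = split_622(range(6119))
```

Applying the same transform to the image and its label keeps the pair aligned; the remainder left over by integer rounding goes to the test part.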

Fourth step: an improved deep labv3+ network model was constructed. The deep LabV3+ semantic segmentation network restores the size of the output feature image to the original image size by utilizing the upsampling structure, and simultaneously compensates the problem of losing space position information by carrying out feature fusion on the bottom layer features of the image and the obtained high-level semantic features, thereby improving the accuracy of semantic segmentation boundary information. The deep labv3+ semantic segmentation network is shown in fig. 2.

The encoding region, namely the Encoder part, extracts initial features of the image through a depth separable convolutional neural network (DCNN), transmits the extracted initial features to pyramid pooling modules (Atrous Spatial Pyramid Pooling, ASPP) with cavity convolution, extracts context information of the image by utilizing different expansion convolution rates, acquires a multi-scale feature map of the image, and finally passes through a 1X 1 convolution layer. The main purpose of the Encoder section is to reduce feature mapping in the image, extract deep semantic features in the image, including multi-scale context information.

The decoding area, namely the Decoder part, mainly carries out 4 times up-sampling on the feature image output by the Encoder part, stacks the feature image output by the DCNN with the bottom feature image output by the DCNN after carrying out 1X 1 convolution for dimension reduction, and then restores the spatial resolution and the spatial position information of the image through 4 times up-sampling. The main purpose of the Decoder section is to recover the spatial information of the image by upsampling, thereby enabling the capture of clear and accurate target boundary information in the image.
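The feature map sizes implied by the two 4× upsampling steps can be checked with a few lines of arithmetic. This is a sketch assuming an encoder output stride of 16 and low-level features at stride 4, which is the usual DeepLabV3+ configuration and is consistent with two 4× upsamplings; the patent does not state these strides explicitly, and the function name is hypothetical.

```python
def decoder_sizes(tile: int = 128, output_stride: int = 16,
                  low_level_stride: int = 4):
    """Return (encoder size, size after first upsampling, restored size)."""
    encoder = tile // output_stride        # side length of the ASPP output
    after_first_up = encoder * 4           # first 4x upsampling in the decoder
    low_level = tile // low_level_stride   # side length of the low-level DCNN features
    if after_first_up != low_level:
        raise ValueError("feature maps cannot be concatenated: sizes differ")
    restored = after_first_up * 4          # second 4x upsampling
    return encoder, after_first_up, restored


sizes = decoder_sizes()  # for a 128 x 128 input tile
```

For a 128×128 tile this gives an 8×8 encoder output, a 32×32 map at the concatenation point and a 128×128 prediction, so the first upsampling is what makes the high-level and low-level maps the same size.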

The invention improves the backbone network of the DeepLabV3+ network model. The Xception (Extreme Inception) network is one of the backbone networks of the DeepLabV3+ semantic segmentation network and is an improvement on the InceptionV3 network structure. The InceptionV1 module applies convolution kernels of three sizes, 1×1, 3×3 and 5×5, together with a 3×3 max pooling operation, concatenates all the results and passes them to the next InceptionV1 module. By adding 1×1 convolutions, the structure reduces the dimensionality of the feature maps and captures clear boundary information in the input image.

Aligned Xception is an improved network based on the Xception network; the structure of the improved Aligned Xception network is shown in fig. 3. First, the number of network layers is increased to deepen the network and extract rich semantic information from the image. Second, every max pooling layer with stride=2 in the Entry flow of the original Xception network is replaced with a depthwise separable convolution with stride=2. Finally, BN and a ReLU activation function are added after each 3×3 depthwise separable convolution, and the Middle flow block is repeated 16 times instead of the original 8 times, which optimizes network performance and improves segmentation accuracy.
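To illustrate why depthwise separable convolutions keep the deeper backbone affordable, the following sketch counts convolution weights. The channel width of 728 and the three separable convolutions per Middle flow block come from the published Xception design and are assumptions here, not values stated in the patent; biases and BN parameters are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights of an ordinary k x k convolution."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights of a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


CHANNELS = 728        # assumed Middle flow width (published Xception design)
CONVS_PER_BLOCK = 3   # assumed separable convolutions per Middle flow block

std = standard_conv_params(3, CHANNELS, CHANNELS)
sep = separable_conv_params(3, CHANNELS, CHANNELS)

# Middle flow weights with 8 repeats (original) and 16 repeats (Aligned Xception).
middle_original = 8 * CONVS_PER_BLOCK * sep
middle_aligned = 16 * CONVS_PER_BLOCK * sep
```

Under these assumptions a separable 3×3 layer needs 536,536 weights against 4,769,856 for a standard one, so even the 16-repeat Middle flow stays well below the cost of an 8-repeat flow built from standard convolutions.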

Fifth step: and carrying out semantic segmentation training on the improved deep LabV3+ network model according to the water sample data set, and adjusting parameters in the deep LabV3+ network model to obtain an optimal deep LabV3+ network model.

The Accuracy evaluation index is used to evaluate network performance, including Accuracy (Accuracy), average-to-average ratio (Mean Intersection over Union, MIoU) and frequency-to-weight ratio (Frequency Weighted Intersection over Union, FWIoU). Accurcy refers to the ratio of the number of correctly predicted pixels to the total number of pixels, and represents the probability of correctly classifying pixels. The larger the Accuracy value is, the larger the approximation degree of the predicted value and the true value is, namely, the higher the network precision is. The specific formula is as follows:
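The standard definition matching this description can be written as below, assuming k classes (here water and non-water) and letting p_ij denote the number of pixels of true class i predicted as class j; the notation is an assumption, since the original formula is not reproduced in this text.

```latex
\mathrm{Accuracy} = \frac{\sum_{i=0}^{k-1} p_{ii}}{\sum_{i=0}^{k-1} \sum_{j=0}^{k-1} p_{ij}}
```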

IoU is the ratio of the intersection to the union of the true values and the predicted values; an IoU greater than 0.5 indicates good segmentation performance [77]. MIoU is the mean of the per-class IoU values.
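In the usual notation, with k classes and p_ij the number of pixels of true class i predicted as class j (symbols assumed here for illustration), these two quantities are commonly written as:

```latex
\mathrm{IoU}_i = \frac{p_{ii}}{\sum_{j=0}^{k-1} p_{ij} + \sum_{j=0}^{k-1} p_{ji} - p_{ii}},
\qquad
\mathrm{MIoU} = \frac{1}{k} \sum_{i=0}^{k-1} \mathrm{IoU}_i
```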

MIoU represents how closely the model's predictions approximate the true values. Its value lies in the range [0,1], and a larger value means a more accurate prediction.

FWIoU assigns each class a weight according to its frequency of occurrence in the image and is the sum over all classes of the weight multiplied by the IoU of that class.
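The three indexes can be computed from a confusion matrix as in the following sketch. It follows the verbal definitions above under the usual conventions (rows are true classes, columns are predicted classes, class weights are the true-class pixel frequencies); the function names and the ten-pixel toy example are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes: int = 2):
    """cm[t][p] counts pixels of true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (Accuracy, MIoU, FWIoU) for a square confusion matrix."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    ious, weights = [], []
    for i in range(k):
        truth_i = sum(cm[i])                       # pixels whose true class is i
        pred_i = sum(cm[j][i] for j in range(k))   # pixels predicted as class i
        union = truth_i + pred_i - cm[i][i]
        ious.append(cm[i][i] / union if union else 0.0)
        weights.append(truth_i / total)            # class frequency used by FWIoU
    accuracy = correct / total
    miou = sum(ious) / k
    fwiou = sum(w * iou for w, iou in zip(weights, ious))
    return accuracy, miou, fwiou


# Ten toy pixels: 0 = non-water, 1 = water (hypothetical data).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
accuracy, miou, fwiou = segmentation_metrics(confusion_matrix(y_true, y_pred))
```

In the toy example 8 of 10 pixels are correct, the per-class IoU values are 5/7 and 3/5, and FWIoU weights them by the class frequencies 0.6 and 0.4.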

The invention runs experiments on the data sets of the feature combination schemes with the improved backbone network and compares the effect of the improved Aligned Xception backbone on segmentation accuracy. Adjusting the parameters of the DeepLabV3+ network model shows that, with a learning rate of 0.006 and the cross-entropy loss function (ce), the Accuracy, MIoU and FWIoU values are the highest, the segmentation is the most accurate and the water body information is segmented best.
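For reference, the pixel-wise cross-entropy loss named above can be sketched as follows. This is a generic formulation, assuming the network outputs per-pixel softmax probabilities; the function name, the `eps` guard and the two-pixel example are hypothetical, and only the learning rate value comes from the text.

```python
import numpy as np

LEARNING_RATE = 0.006  # best-performing value reported in the text


def pixel_cross_entropy(probs: np.ndarray, labels: np.ndarray,
                        eps: float = 1e-12) -> float:
    """Mean cross-entropy over all pixels.

    probs:  (H, W, C) softmax probabilities per pixel.
    labels: (H, W) integer class ids (0 = non-water, 1 = water).
    """
    h, w = labels.shape
    # Pick, for every pixel, the probability assigned to its true class.
    picked = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return float(-np.log(picked + eps).mean())


# Two pixels: the first is non-water (p = 0.9), the second is water (p = 0.8).
probs = np.array([[[0.9, 0.1], [0.2, 0.8]]])
labels = np.array([[0, 1]])
loss = pixel_cross_entropy(probs, labels)
```

The loss is the average of -ln(0.9) and -ln(0.8) for the example, and it shrinks toward zero as the probability of the true class approaches 1.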

Sixth step: and extracting the water body information through an optimal deep LabV3+ network model.

The water body extraction method based on the deep LabV3+ model, the threshold segmentation method, the NDWI water body index method and the SVM classification method of the improved Aligned Xreception backbone network is respectively carried out, and the water body extraction method is compared with 3 evaluation indexes and the consumed time in accuracy, homogeneous intersection ratio, frequency-weighted intersection ratio and the consumed time.

As can be seen from the comparison of the water body extraction results for Ulva city in FIG. 4, all four methods can extract the overall outline of the water bodies in Ulva city, but they differ in local detail. The invention therefore compares the results for three types of water body: large water bodies in the Yangtze river basin, medium-sized water bodies such as rivers and lakes, and small water bodies such as farmland ditches.

As can be seen from FIG. 5, all four methods can accurately extract the water body region for the large water bodies in the Yangtze river basin. However, buildings are present in the image and interfere with the classical water body extraction methods, which also extract part of the buildings and similar ground objects along with the water body information. The method of the invention effectively avoids this interference, so the water body information it extracts is more accurate.

As can be seen from FIG. 6, the differences in extraction detail are significant for medium-sized water bodies. At the river bend in the center of the image, the threshold segmentation method and the SVM classification method extract the water body accurately, but their results contain a large amount of background ground object information, while the NDWI water body index method loses part of the river region.

For fine water bodies such as farmland ditches, as can be seen from FIG. 7, the image contains fine water body information such as farmland pits, and its upper part contains ground objects such as building groups. The extraction results of the NDWI water body index method and the SVM classification method show serious water body omissions, and the river in their results is cut off. The threshold segmentation method is severely interfered with by buildings when extracting fine water body information, and a large amount of interference information appears at the lower left corner and in the upper part of its extraction result image.
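For reference, the NDWI water body index method compared above is commonly computed per pixel as (Green - NIR) / (Green + NIR), with pixels above a threshold labeled as water. The following is a minimal sketch of that baseline only, not the implementation used in the experiments; the function names, the toy reflectance values and the threshold of 0 are illustrative assumptions, and plain Python lists are used only to keep the example self-contained.

```python
def ndwi(green, nir, eps=1e-12):
    """Per-pixel NDWI = (Green - NIR) / (Green + NIR) for 2-D lists of reflectance."""
    return [
        [(g - n) / (g + n + eps) for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green, nir)
    ]


def water_mask(index, threshold=0.0):
    """Label a pixel as water (1) when its index exceeds the threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in index]


# Toy 2x2 scene (hypothetical values): left column water, right column land.
green = [[0.10, 0.08], [0.12, 0.09]]
nir = [[0.02, 0.30], [0.03, 0.25]]

mask = water_mask(ndwi(green, nir))
print(mask)
```

In such a baseline the threshold usually has to be tuned per scene, which is one reason index and threshold methods need manual adjustment.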

The accuracy of the water body extraction results was compared and analyzed using 3 evaluation indexes: accuracy (Accuracy), mean intersection over union (MIoU) and frequency-weighted intersection over union (FWIoU). The results are shown in Table 3.

Table 3 Comparison of the accuracy of the 4 different methods

As can be seen from the table, the method of the invention has obvious advantages on all 3 evaluation indexes: Accuracy, MIoU and FWIoU. The 3 water body sample data sets finally established by the invention all use a spatial resolution of 10 m. Comparing the extraction results for large, medium-sized and small water bodies, the method of the invention, the threshold segmentation method, the NDWI water body index method and the SVM classification method all achieve good precision on large water bodies, but their results differ on medium-sized and small water bodies. Small water bodies such as farmland ditches in particular occupy only a few pixels in the image, so the limited resolution prevents their boundary outlines from being extracted accurately. The NDWI water body index method cannot accurately extract water body boundary information because of mixed pixels, and the threshold segmentation method cannot accurately distinguish water bodies from certain background ground objects because different objects in the image share the same spectrum ("foreign matter identical spectrum"). The SVM classification method requires manually selected samples, and because of the low image resolution these samples cannot be guaranteed to be pure pixels, so its water body extraction results are not ideal.
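The three evaluation indexes named above (Accuracy, MIoU and the frequency-weighted intersection over union) can all be derived from a confusion matrix of true versus predicted pixel classes. The following is a minimal sketch using the standard definitions of these indexes; the function name and the example counts are hypothetical and are not values from Table 3.

```python
def segmentation_metrics(conf):
    """Compute Accuracy, MIoU and FWIoU from a confusion matrix.

    conf[i][j] = number of pixels whose true class is i and predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    accuracy = sum(conf[i][i] for i in range(n)) / total

    ious, freqs = [], []
    for i in range(n):
        true_i = sum(conf[i])                       # pixels truly in class i
        pred_i = sum(conf[r][i] for r in range(n))  # pixels predicted as class i
        union = true_i + pred_i - conf[i][i]
        # Assumption: a class with no true or predicted pixels scores 0.
        ious.append(conf[i][i] / union if union else 0.0)
        freqs.append(true_i / total)

    miou = sum(ious) / n                                 # unweighted class mean
    fwiou = sum(f * iou for f, iou in zip(freqs, ious))  # weighted by class frequency
    return accuracy, miou, fwiou


# Hypothetical 2-class example: class 0 = background, class 1 = water.
conf = [[80, 5],
        [5, 10]]
acc, miou, fwiou = segmentation_metrics(conf)
print(f"Accuracy={acc:.4f}  MIoU={miou:.4f}  FWIoU={fwiou:.4f}")
```

Because water usually covers far fewer pixels than background, MIoU penalizes errors on the water class more strongly than Accuracy or the frequency-weighted index does, which is why the three indexes are reported together.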

In the current big data era, accurate and efficient extraction of water body information has become the mainstream development trend. To evaluate the efficiency of the four methods used in the experiment, the calculation time of the method of the invention, the threshold segmentation method, the NDWI water body index method and the SVM classification method during water body information extraction was recorded manually, and the methods were compared on this time together with the degree of human participation in the experimental process, as shown in Table 4.

Table 4 Comparison of the efficiency of the different methods

As can be seen from Table 4, the time spent by the method of the invention on the test set is lower than that of the NDWI water body index method and the SVM classification method, but higher than that of the threshold segmentation method. The threshold segmentation method, the NDWI water body index method and the SVM classification method all require manual participation in the extraction process: the NDWI water body index method requires band calculations on the image, both the threshold segmentation method and the NDWI water body index method require repeated manual tuning, and the SVM classification method requires water body and non-water body sample data to be selected manually for training before water body information can be extracted. The invention uses a convolutional neural network to learn and extract deep features from a large amount of sample data. The whole network training process requires no manual participation, the network fits the optimal network parameters by itself, and water body information is extracted automatically from the test set data.
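The time comparison described above relies on manually recorded calculation times; the same measurement could also be taken programmatically. The following is a minimal sketch under that assumption. `time_method` and the stand-in `dummy_extractor` are hypothetical names and do not correspond to any of the four methods as actually implemented in the experiment.

```python
import time


def time_method(extract, image, repeats=3):
    """Run extract(image) several times; return its result and the best wall-clock time in seconds."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = extract(image)
        best = min(best, time.perf_counter() - start)
    return result, best


def dummy_extractor(image):
    """Stand-in for an extraction method: marks positive pixels as water (1)."""
    return [[1 if v > 0 else 0 for v in row] for row in image]


image = [[0.4, -0.2], [0.1, -0.5]]
mask, seconds = time_method(dummy_extractor, image)
print(mask, f"{seconds:.6f} s")
```

Taking the best of several runs reduces the influence of one-off system delays. It measures computation only and does not capture the manual effort that the comparison also takes into account.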

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. A high-resolution remote sensing image water body extraction method based on a semantic segmentation model is characterized by comprising the following steps of:

acquiring remote sensing image data, including GF-3 remote sensing image data and GF-6 remote sensing image data;

extracting the water body spectral characteristics of the remote sensing image data, wherein the water body spectral characteristics are the NIR band;

combining the water body spectral characteristics with HH and HV dual-polarization mode characteristics to construct a water body sample data set, wherein the dual-polarization mode is the HH and HV dual-polarization mode of the Gaofen-3 (GF-3) satellite;

extracting water body information through the optimal DeepLabV3+ network model;

wherein combining the water body spectral characteristics with the HH and HV dual-polarization mode characteristics to construct the water body sample data set specifically comprises the following steps:

extracting coarse water body information according to the image data set;

refining the coarse water body information;

stacking the surface map on the image data set to generate a grid label image;

screening the cut sample label subset to manufacture a water sample label set;

2. The method for extracting the water body of the high-resolution remote sensing image based on the semantic segmentation model according to claim 1, wherein the parameter adjustment in the improved DeepLabV3+ network model comprises setting different loss functions and learning rates.

3. The method for extracting the water body of the high-resolution remote sensing image based on the semantic segmentation model as claimed in claim 1, wherein the GF-6 remote sensing image data is preprocessed before the extraction of the spectral features of the water body; the GF-6 remote sensing image data preprocessing comprises radiometric calibration, atmospheric correction, orthographic correction and image fusion.

4. The method for extracting the water body of the high-resolution remote sensing image based on the semantic segmentation model according to claim 3, wherein the pre-processed GF-6 remote sensing image data is resampled so that the spatial resolution of the GF-6 remote sensing image data is the same as that of the GF-3 remote sensing image data.

5. The method for extracting the water body of the high-resolution remote sensing image based on the semantic segmentation model as claimed in claim 1, wherein the water body sample data set is divided in the ratio …:2:2 into a training set, a verification set and a test set.