CN113255581B - Weak supervision deep learning water body extraction method and device, computer equipment and medium - Google Patents

Weak supervision deep learning water body extraction method and device, computer equipment and medium

Info

Publication number
CN113255581B
CN113255581B (application CN202110684292.5A)
Authority
CN
China
Prior art keywords
group
adjacent
neural network
training
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110684292.5A
Other languages
Chinese (zh)
Other versions
CN113255581A (en
Inventor
方乐缘
鲁鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202110684292.5A priority Critical patent/CN113255581B/en
Publication of CN113255581A publication Critical patent/CN113255581A/en
Application granted granted Critical
Publication of CN113255581B publication Critical patent/CN113255581B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a weakly supervised deep learning water body extraction method, device, computer equipment and medium. The method comprises: constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network on the training set to obtain a trained convolutional neural network; acquiring an original remote sensing image and performing proximity sampling on it with an adjacent sampler to obtain a group of adjacent images; inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group; and binarizing the predicted water body extraction probability map group to obtain a prediction result group, then voting on the prediction result group according to a preset voting principle to determine the final water body extraction result. The method effectively improves the precision of water body extraction.

Description

Weak supervision deep learning water body extraction method and device, computer equipment and medium
Technical Field
The invention relates to the technical field of remote sensing image processing, and in particular to a weakly supervised deep learning water body extraction method and device, computer equipment and a medium.
Background
Water body extraction refers to extracting water surfaces from an image, in particular extracting naturally formed water bodies such as rivers, lakes and seas, and artificial water bodies such as paddy fields, reservoirs and canals, from a remote sensing image. Traditional remote sensing water body extraction methods include the single-band method, the water index method, supervised classification, the inter-spectral relationship method and related algorithms. Although these methods have made progress in water body extraction, they still suffer from a low degree of automation, tedious hand-crafted features and insufficient extraction precision. Meanwhile, with the rapid progress of remote sensing technology, the surface texture information contained in high-resolution remote sensing images has become ever more refined and the edge structure information ever richer, so traditional remote sensing water body extraction methods struggle to fully exploit the semantically rich information of high-resolution remote sensing images and to meet growing remote sensing application demands.
Disclosure of Invention
In view of the above technical problems, the invention provides a weakly supervised deep learning water body extraction method and device, computer equipment and a medium that can effectively improve water body extraction precision.
In one embodiment, a weakly supervised deep learning water body extraction method comprises the following steps:
step S100: constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network;
step S200: acquiring an original remote sensing image, and performing proximity sampling on the original remote sensing image with an adjacent sampler to obtain a group of adjacent images;
step S300: inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group;
step S400: binarizing the predicted water body extraction probability map group to obtain a prediction result group, and voting on the prediction result group according to a preset voting principle to determine the final water body extraction result.
Preferably, the width and height of the original remote sensing image are W and H, respectively, the size of each grid is K × K, and step S200 includes:
step S210: splitting the original remote sensing image into W/K × H/K grids;
step S220: selecting, from the grid in the i-th row and j-th column, all of its K × K elements and regarding them respectively as the (i, j)-th elements of the adjacent images in the adjacent image group, until sampling of all W/K × H/K grids is completed, thereby generating a group of adjacent images, where 1 ≤ i ≤ H/K and 1 ≤ j ≤ W/K.
preferably, step S100 includes:
step S110: constructing an initial convolutional neural network, wherein the structure of the initial convolutional neural network is an encoder-decoder network structure;
step S120: acquiring a remote sensing image training set, and carrying out proximity sampling on remote sensing images in the remote sensing image training set through a proximity sampler to obtain a group of training proximity image groups;
step S130: inputting the training adjacent image group into an initial convolutional neural network, and extracting adjacent features to obtain an adjacent feature group;
step S140: performing adjacent feature aggregation on the adjacent feature group to obtain aggregated adjacent features;
step S150: performing post-processing and point constraint processing on the aggregated adjacent features to obtain a pseudo label;
step S160: replacing the point label supervision information in the initial convolutional neural network with the pseudo label, repeating steps S120 to S130 to obtain a prediction probability map group, binarizing the prediction probability map group to obtain a training prediction result group, and voting on the training prediction result group according to a preset voting principle to obtain a comprehensive prediction result;
step S180: replacing the pseudo label with the comprehensive prediction result, repeating step S160 for iterative training, and ending training when the number of iterations reaches a preset number, to obtain the trained convolutional neural network.
Preferably, after step S160 and before step S180, the method further includes:
step S170: calculating a binary cross entropy L_bce and a Dice loss L_dice from the comprehensive prediction result and the point label supervision information, summing L_bce and L_dice to obtain the total training loss, and back-propagating through the initial convolutional neural network according to the total training loss to update the parameters of the initial convolutional neural network.
Preferably, step S140 includes:
step S141: obtaining an intermediate result by max pooling along the channel dimension for each adjacent feature in the adjacent feature group;
step S142: binarizing the intermediate results of all adjacent features with the maximum inter-class variance (Otsu) method to obtain an adjacent feature map for each adjacent feature;
step S143: voting on the adjacent feature maps of all adjacent features of the adjacent feature group according to a preset voting principle to obtain the aggregated adjacent features.
Preferably, step S150 includes:
step S151: filling holes in the aggregated adjacent features, and performing a morphological opening operation on the hole-filled aggregated features to remove noise points;
step S152: screening the denoised aggregated features with point constraint processing to obtain a pseudo label.
Preferably, the encoder-decoder network structure specifically includes a plurality of encoding modules and a plurality of decoding modules. Each encoding module includes two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit (ReLU) and a max pooling layer; the last decoding module includes two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit and a sigmoid function; the remaining decoding modules include two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit and an upsampling layer.
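The spatial bookkeeping of such an encoder-decoder can be traced with a small sketch. The number of modules and the input resolution below are illustrative assumptions (the patent does not specify them): each encoding module ends with max pooling that halves the spatial size, and every decoding module except the last ends with upsampling that doubles it.

```python
def encoder_decoder_shapes(h, w, num_modules=4):
    """Trace spatial sizes through the assumed encoder-decoder.

    Each encoding module E1..En halves H and W (max pooling);
    each decoding module except the last doubles them (upsampling);
    the last decoding module keeps the size and applies a sigmoid.
    Module count and input size are illustrative assumptions only.
    """
    shapes = [("input", h, w)]
    for k in range(1, num_modules + 1):          # encoder path
        h, w = h // 2, w // 2
        shapes.append((f"E{k}", h, w))
    for k in range(1, num_modules):              # decoder path with upsampling
        h, w = h * 2, w * 2
        shapes.append((f"D{k}", h, w))
    shapes.append((f"D{num_modules}", h, w))     # last module: sigmoid, no upsampling
    return shapes

# With four modules on each side (an assumption), a 256x256 input is reduced
# to 16x16 by E4 and brought back to 128x128 by D4.
print(encoder_decoder_shapes(256, 256))
```

Because the last decoding module has no upsampling layer, n encoders and n decoders under these assumptions yield an output at half the input resolution; the actual module counts in the patent are unspecified.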
In one embodiment, a weakly supervised deep learning water body extraction device comprises:
the neural network construction training module is used for constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network;
the proximity sampling module is used for acquiring an original remote sensing image and carrying out proximity sampling on the original remote sensing image through a proximity sampler to obtain a group of proximity image groups;
the water body extraction probability map group prediction module is used for inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group;
and the water body extraction result determining module is used for binarizing the predicted water body extraction probability map group to obtain a prediction result group, and voting on the prediction result group according to a preset voting principle to determine the final water body extraction result.
In an embodiment, a computer device comprises a memory and a processor, the memory storing a computer program, the processor implementing the steps of the above method when executing the computer program.
In an embodiment, a computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method.
According to the weakly supervised deep learning water body extraction method, device, computer equipment and medium, the original remote sensing image is proximity-sampled by the adjacent sampler to obtain a group of adjacent images, which helps to decouple water and non-water features, improves the precision of water body extraction and improves extraction performance. The group of adjacent images is input into the trained convolutional neural network, and the water body features of the adjacent images are aggregated to generate the predicted water body extraction probability maps, significantly reducing falsely extracted and missed water regions. Finally, the predicted water body extraction probability map group is binarized to obtain a prediction result group, and the prediction result group is voted on according to a preset voting principle to obtain the final water body extraction result, effectively improving the accuracy of water region prediction.
Drawings
Fig. 1 is a flowchart of a method for extracting a water body in weak supervised deep learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a convolutional neural network training process according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of proximity sampling by the adjacent sampler according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a portion of a neighborhood feature aggregation proposed in one embodiment of the present invention;
FIG. 5 is a schematic view of a post-processing section according to an embodiment of the present invention;
FIG. 6 is a table comparing test results of one embodiment of the method of the present invention to other prior art methods;
FIG. 7 is a schematic diagram of comparison of water extraction results between one embodiment of the present invention and other prior art methods;
FIG. 8 is a comparison of test results of iterative training as provided by one embodiment of the present invention;
fig. 9 is a schematic diagram illustrating comparison of water body extraction results of iterative training according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the present invention is further described in detail below with reference to the accompanying drawings.
In one embodiment, as shown in fig. 1, a method for extracting a weakly supervised deep learning water body includes the following steps:
step S100: constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network on the training set to obtain the trained convolutional neural network.
In recent years, deep learning has become a new focus in the field of artificial intelligence. The rapid development of deep learning technology and improvements in computer hardware have enabled deep learning, especially convolutional neural networks, to be successfully applied in many fields such as image classification, object detection and semantic segmentation, with performance exceeding many traditional algorithms; it is therefore natural to cast water body extraction as a binary semantic segmentation problem. In the method, the trained convolutional neural network is obtained by constructing and training an initial convolutional neural network. Owing to the strong capability of convolutional neural networks to capture the information in remote sensing images, using a convolutional neural network as an automatic feature extractor overcomes the shortcomings of the traditional methods well and improves the accuracy of water body extraction.
In one embodiment, step S100 includes:
step S110: and constructing an initial convolutional neural network, wherein the structure of the initial convolutional neural network is an encoder-decoder network structure.
Further, the encoder-decoder network structure specifically includes a plurality of encoding modules and a plurality of decoding modules. Each encoding module includes two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit (ReLU) and a max pooling layer; the last decoding module includes two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit and a sigmoid function; the remaining decoding modules include two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit and an upsampling layer.
Step S120: and acquiring a remote sensing image training set, and carrying out proximity sampling on the remote sensing images in the remote sensing image training set through a proximity sampler to obtain a group of training proximity image groups.
Specifically, as shown in fig. 2, the remote sensing images in the present invention are all corresponding high-resolution visible-light remote sensing images. Performing proximity sampling on a high-resolution visible-light remote sensing image x means sampling the same image x with an adjacent sampler N = (n1, n2, ..., nI), using ni in the adjacent sampler N to acquire the corresponding adjacent image ni(x), where i denotes the i-th adjacent image in the adjacent image group, and {n1(x), n2(x), ..., nI(x)} is the adjacent image group sampled by the adjacent sampler N.
Further, the width and height of the remote sensing images in the remote sensing image training set are W and H, respectively, the size of each grid is K × K, and step S120 includes:
step S121: splitting each remote sensing image in the training set into W/K × H/K grids;
step S122: selecting, from the grid in the i-th row and j-th column, all of its K × K elements and regarding them respectively as the (i, j)-th elements of the images in the training adjacent image group, until sampling of all W/K × H/K grids is completed, thereby generating a group of training adjacent images, where 1 ≤ i ≤ H/K and 1 ≤ j ≤ W/K.
Specifically, when K is 2, for the grid in the i-th row and j-th column, the upper-left, upper-right, lower-left and lower-right positions are selected and regarded respectively as the (i, j)-th elements of N = (n1, n2, n3, n4); the previous steps are repeated for all W/K × H/K grids until sampling of all grids is completed, generating the adjacent sampler N = (n1, n2, n3, n4). Given an image x, a group of adjacent images (n1(x), n2(x), n3(x), n4(x)) is generated, where each image has size W/K × H/K.
As shown in fig. 3, this example illustrates generating an adjacent image group from a single image x with the adjacent sampler, where K = 2 and the grid size is 2 × 2. The upper-left, upper-right, lower-left and lower-right elements are selected in each grid in turn and are labeled Ai, Bi, Ci and Di respectively, where i denotes the index of the grid the element belongs to. All pixels labeled Ai become the pixels of the down-sampled image n1(x), all pixels labeled Bi become the pixels of the down-sampled image n2(x), and likewise for Ci and Di. The resulting adjacent image group (n1(x), n2(x), n3(x), n4(x)) is shown on the right of fig. 3. Weakly supervised learning with the adjacent sampler and the convolutional neural network requires no pixel-level labels, effectively saving the labor and financial cost of remote sensing image annotation.
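The proximity sampling above amounts to strided slicing over the image. A minimal numpy sketch, where the function name and the 4 × 4 toy image are illustrative assumptions:

```python
import numpy as np

def adjacent_sample(x, k=2):
    """Split x (H x W) into k x k grids and collect, for each of the k*k
    in-grid positions, one down-sampled image of size (H/k) x (W/k)."""
    h, w = x.shape[:2]
    assert h % k == 0 and w % k == 0, "image must tile evenly into k x k grids"
    # Taking position (di, dj) from every grid is plain strided slicing.
    return [x[di::k, dj::k] for di in range(k) for dj in range(k)]

x = np.arange(16).reshape(4, 4)      # toy 4x4 "image"
n1, n2, n3, n4 = adjacent_sample(x, k=2)
# n1 holds the upper-left pixel of every 2x2 grid (the Ai pixels),
# n2 the upper-right (Bi), n3 the lower-left (Ci), n4 the lower-right (Di).
print(n1.tolist())  # [[0, 2], [8, 10]]
```

Each of the four outputs has size H/2 × W/2, matching the adjacent image sizes described for fig. 3.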
Step S130: and inputting the training adjacent image group into an initial convolutional neural network, and extracting adjacent features to obtain an adjacent feature group.
In particular, as shown in fig. 2, the end-to-end convolutional neural network that extracts features from the adjacent image group {n1(x), n2(x), n3(x), n4(x)} uses an encoder-decoder network structure supervised during training by a point label w. In fig. 2, N denotes the adjacent sampler, E_k denotes the k-th encoding module, D_k denotes the k-th decoding module, S denotes the sigmoid function, and fi(x) denotes the feature extracted by the second-to-last convolutional layer of the last decoding module from the i-th adjacent image ni(x). Features are extracted from each adjacent image in turn, and the adjacent feature group is denoted {f1(x), f2(x), f3(x), f4(x)}.
Step S140: and carrying out adjacent feature aggregation on the adjacent feature group to obtain adjacent aggregated features.
Further, step S140 includes:
step S141: obtaining an intermediate result by max pooling along the channel dimension for each adjacent feature in the adjacent feature group;
step S142: binarizing the intermediate results of all adjacent features with the maximum inter-class variance (Otsu) method to obtain an adjacent feature map for each adjacent feature;
step S143: voting on the adjacent feature maps of all adjacent features of the adjacent feature group according to a preset voting principle to obtain the aggregated adjacent features.
Specifically, in fig. 4, the input adjacent feature group {f1, f2, f3, f4} has dimension B × C × W/2 × H/2 × I, where B is the batch size, C is the number of channels of the feature map, W and H are the width and height of the input image x, and I is the number of images in an adjacent image group; in our method, when the grid size K = 2 in the adjacent sampler, I = 4. Max pooling along the channel dimension (C-MaxPooling) reduces each adjacent feature of size B × C × W/2 × H/2 to B × W/2 × H/2, giving B × I × W/2 × H/2 features in total; binarization with the maximum inter-class variance method (Otsu) then yields a feature map of size B × I × W/2 × H/2; finally, voting along the I direction produces the B × W/2 × H/2 aggregated adjacent feature F. The method exploits the property that the intra-class difference between adjacent water body pixels is smaller than that between adjacent pixels of other natural ground objects to distinguish water bodies from other natural ground objects, making the water contour in the extraction result clearer.
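A minimal numpy sketch of this aggregation for a single sample (the Otsu implementation, function names, and the tie-handling rule that at least half the votes counts as water are simplifying assumptions):

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Maximum inter-class variance (Otsu) threshold for a 2-D array."""
    hist, edges = np.histogram(img, bins=bins)
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                     # class-0 weight at each cut
    m = np.cumsum(p * centers)            # cumulative mean
    mt = m[-1]                            # global mean
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    var_between = np.zeros_like(w0)
    var_between[valid] = (mt * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(var_between)]

def aggregate_adjacent(features):
    """features: list of I arrays, each C x H x W (one per adjacent image).
    Channel max-pool -> Otsu binarize -> majority vote over the I maps."""
    binary_maps = []
    for f in features:
        pooled = f.max(axis=0)                    # C-MaxPooling: C x H x W -> H x W
        binary_maps.append(pooled > otsu_threshold(pooled))
    votes = np.stack(binary_maps).sum(axis=0)     # I x H x W -> per-pixel vote count
    return votes >= (len(features) + 1) // 2      # majority vote -> aggregated F
```

With I = 4 adjacent features, a pixel is kept as water when at least 2 of the 4 binarized maps agree; batching adds a leading B dimension without changing the logic.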
Step S150: and carrying out post-processing and point constraint processing on the adjacent aggregated features to obtain the pseudo label. Specifically, Post Processing (PP) includes region filling and morphological Processing.
Further, as shown in fig. 5, step S150 includes:
step S151: filling holes in the aggregated adjacent features, and performing a morphological opening operation on the hole-filled aggregated features to remove noise. Specifically, hole filling (Fill) fills the holes inside closed regions so that they become water body regions.
step S152: screening the denoised aggregated features with point constraint processing to obtain the pseudo label.
Specifically, holes in the aggregated adjacent feature are first filled, and noise is then removed with a morphological opening operation (Open). Point label constraint (Point constraint) is then applied to the processed image as follows: if a region in the image contains a point label, the whole region is kept; if a region does not contain a point label, it is discarded.
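Assuming SciPy is available (its `ndimage` module provides the needed morphology primitives), the fill-open-constrain chain might be sketched as follows; the function name, the 3 × 3 structuring element, and the (row, col) point-label format are illustrative assumptions:

```python
import numpy as np
from scipy import ndimage

def postprocess_with_points(mask, point_labels):
    """mask: boolean H x W aggregated feature; point_labels: list of (row, col)
    water point annotations. Fill holes, open to remove noise, then keep only
    connected regions containing at least one point label (point constraint)."""
    filled = ndimage.binary_fill_holes(mask)                # Fill: close interior holes
    opened = ndimage.binary_opening(filled,
                                    structure=np.ones((3, 3)))  # Open: remove noise
    labeled, _ = ndimage.label(opened)                      # connected components
    keep = {labeled[r, c] for r, c in point_labels if labeled[r, c] != 0}
    if not keep:
        return np.zeros_like(mask, dtype=bool)
    return np.isin(labeled, sorted(keep))                   # pseudo label m
```

Regions that survive opening but contain no point annotation are dropped entirely, which is what limits the pseudo label to point-confirmed water bodies.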
Step S160: and replacing the pseudo label with the point label monitoring information in the initial convolutional neural network, repeating the steps from S120 to S130 to obtain a prediction probability map set, carrying out binarization processing on the prediction probability map set to obtain a training prediction result set, and voting the training prediction result set according to a preset voting principle to obtain a comprehensive prediction result.
Specifically, as shown in fig. 2, the pseudo label m replaces the point label w supervision information of the encoder-decoder network, and the network is trained according to steps S120 to S130. A prediction probability map group {p1, p2, p3, p4} is obtained from the last decoding module, where pi denotes the probability map obtained for the i-th adjacent image. The group {p1, p2, p3, p4} is binarized with a threshold of 0.5 to obtain the training prediction result group {k1, k2, k3, k4}, and the training prediction result group {k1, k2, k3, k4} is voted on according to the preset voting principle to obtain the comprehensive prediction result k, the preset voting principle being that the minority defers to the majority (majority vote).
Step S180: and (5) replacing the pseudo label with the comprehensive prediction result, repeating the step (S160) to carry out iterative training, and finishing the training when the iteration times reach the preset iteration times to obtain the trained convolutional neural network.
Specifically, the comprehensive prediction result replaces the pseudo label, that is, the current comprehensive prediction result is used as new supervision information to participate in the training process, and step S160 is repeated.
In one embodiment, after step S160 and before step S180, the method further comprises:
step S170: calculating a binary cross entropy L_bce and a Dice loss L_dice from the comprehensive prediction result and the point label supervision information, summing L_bce and L_dice to obtain the total training loss, and back-propagating through the initial convolutional neural network according to the total training loss to update the parameters of the initial convolutional neural network.
Specifically, training the parameters of the convolutional neural network through forward and backward propagation can further improve the accuracy of water body extraction.
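A minimal numpy sketch of the two loss terms and their sum (the clipping epsilon and the Dice smoothing constant are illustrative assumptions):

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    """Binary cross entropy L_bce, averaged over pixels."""
    p = np.clip(pred, eps, 1.0 - eps)   # avoid log(0)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))

def dice_loss(pred, target, smooth=1.0):
    """Dice loss L_dice = 1 - 2|P∩T| / (|P| + |T|), with smoothing."""
    inter = np.sum(pred * target)
    return float(1.0 - (2.0 * inter + smooth)
                 / (np.sum(pred) + np.sum(target) + smooth))

def total_loss(pred, target):
    """Total training loss as described: L_bce + L_dice."""
    return bce_loss(pred, target) + dice_loss(pred, target)
```

The BCE term penalizes per-pixel disagreement while the Dice term directly rewards region overlap, which is why the two are summed for joint supervision.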
In one embodiment, step S200: acquiring an original remote sensing image, and performing proximity sampling on the original remote sensing image with the adjacent sampler to obtain a group of adjacent images.
Further, the width and height of the original remote sensing image are W and H, respectively, the size of each grid is K × K, and step S200 includes:
step S210: splitting the original remote sensing image into W/K × H/K grids;
step S220: selecting, from the grid in the i-th row and j-th column, all of its K × K elements and regarding them respectively as the (i, j)-th elements of the adjacent images in the adjacent image group, until sampling of all W/K × H/K grids is completed, thereby generating a group of adjacent images, where 1 ≤ i ≤ H/K and 1 ≤ j ≤ W/K.
in one embodiment, step S300: and inputting the adjacent image group into a trained convolutional neural network to obtain a predicted water body extraction probability map group.
Specifically, in the formal testing stage, a test remote sensing image is input into the trained convolutional neural network without any point label, and the predicted water body extraction probability map group {p1, p2, p3, p4} is output from the sigmoid function module.
In one embodiment, step S400: binarizing the predicted water body extraction probability map group to obtain a prediction result group, and voting on the prediction result group according to the preset voting principle to determine the final water body extraction result.
Specifically, the predicted water body extraction probability map group {p1, p2, p3, p4} is binarized with a threshold of 0.5 to obtain the prediction result group {k1, k2, k3, k4}, and the prediction result group {k1, k2, k3, k4} is voted on to obtain the comprehensive prediction result k.
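The binarize-and-vote step can be sketched in numpy as follows; treating a tie among an even number of maps as water is an assumption, since the text only states majority voting:

```python
import numpy as np

def vote_extraction(prob_maps, threshold=0.5):
    """prob_maps: list of H x W probability maps {p1..pI}.
    Binarize each at `threshold`, then majority-vote per pixel."""
    ks = [p >= threshold for p in prob_maps]   # prediction result group {k1..kI}
    votes = np.stack(ks).sum(axis=0)           # per-pixel water vote count
    return votes * 2 >= len(ks)                # majority vote -> final result k
```

With I = 4 probability maps, a pixel is marked as water when at least 2 of the 4 binarized maps agree.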
The weakly supervised deep learning water body extraction method distinguishes water bodies from other natural ground objects by exploiting the property that the intra-class difference between adjacent water pixels is smaller than that between adjacent pixels of other natural ground objects. Proximity sampling of the original remote sensing image by the adjacent sampler yields a group of adjacent images, which helps to decouple water and non-water features, improves the precision of water body extraction and improves extraction performance. The adjacent image group is input into the trained convolutional neural network, and the water body features of the adjacent images are aggregated to generate the predicted water body extraction probability maps, significantly reducing falsely extracted and missed water regions. Finally, the predicted water body extraction probability map group is binarized to obtain a prediction result group, and the prediction result group is voted on according to the preset voting principle to obtain the final water body extraction result, effectively improving the accuracy of water region prediction; using only point label supervision information, the prediction results can approach the fully supervised level.
In a detailed embodiment, the whole process of the weak supervision deep learning water body extraction method includes:
1) constructing an initial convolutional neural network, and sequentially generating an adjacent image group {n1(x), n2(x), n3(x), n4(x)}, an adjacent feature group {F1(x), F2(x), F3(x), F4(x)}, the adjacent aggregated feature F and the pseudo label m from a remote sensing image x in the remote sensing image training set according to steps S120 to S150, which are not repeated here;
2) replacing the point label w supervision information of the encoder-decoder network with the pseudo label m, training the network according to step S120 and step S130, and obtaining a prediction probability map group {p1, p2, p3, p4} from the last decoding module; in FIG. 2, pi represents the probability map obtained for the ith adjacent image;
3) carrying out binarization on the prediction probability map group {p1, p2, p3, p4} to obtain a prediction result group {k1, k2, k3, k4}, taking a binary threshold value of 0.5;
4) voting the prediction result group {k1, k2, k3, k4} to obtain a comprehensive prediction result k;
5) replacing the pseudo label m in step 2) with k, and repeating steps 2) to 4) to realize iterative training;
6) predicting with the trained encoder-decoder network, and carrying out binarization processing on the predicted probability map to obtain the final prediction result y.
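The iterative scheme of steps 2) to 5) can be sketched as a self-training loop; `train_fn` and `predict_fn` below are hypothetical placeholders for the encoder-decoder training and inference routines described above:

```python
def iterative_training(train_fn, predict_fn, initial_pseudo_label, n_rounds=3):
    """Schematic of steps 2) to 5): train with the current label, predict the
    probability map group {p1..p4}, binarize at 0.5, vote, and feed the vote
    result k back as the next round's label.

    train_fn and predict_fn are hypothetical placeholders, not the patent's
    actual routines; initial_pseudo_label plays the role of pseudo label m.
    """
    label = initial_pseudo_label
    for _ in range(n_rounds):
        model = train_fn(label)                  # step 2): supervise with current label
        prob_maps = predict_fn(model)            # decoder outputs {p1, p2, p3, p4}
        preds = [p >= 0.5 for p in prob_maps]    # step 3): binarize -> {k1..k4}
        votes = sum(preds)                       # per-pixel vote count
        label = votes * 2 >= len(preds)          # step 4)/5): majority vote -> k
    return label
```
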
In this embodiment, in the network training stage before step 6), the binary cross entropy Lbce and the Dice loss function Ldice are used for joint supervision. Specifically, in each iterative training process, the binary cross entropy Lbce and the Dice loss function Ldice are calculated according to each decoding output result and the point label supervision information respectively; the binary cross entropy Lbce and the Dice loss function Ldice are then summed to obtain the total training loss, back propagation is performed, and iteration is repeated until the number of iterations reaches a set threshold, at which point training is judged to be finished.
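A minimal sketch of the joint loss for one decoder output, assuming the standard definitions of binary cross entropy and Dice loss; the smoothing constant `eps` is an implementation assumption:

```python
import numpy as np

def bce_dice_loss(pred, target, eps=1e-7):
    """Joint supervision loss L = Lbce + Ldice for one decoding output.

    pred: predicted probabilities in (0, 1); target: binary labels.
    eps is a numerical-stability / smoothing constant (an assumption,
    not specified in the text).
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    l_bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    inter = np.sum(pred * target)
    l_dice = 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)
    return l_bce + l_dice
```

In training, this quantity would be computed per decoding output and the results summed into the total loss before back propagation, as the text describes.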
In the test stage, key point labels are not needed; the test image is input into the trained convolutional neural network structure, a prediction probability map group {p1, p2, p3, p4} is output through the sigmoid function module, binarization is carried out with 0.5 as the binary threshold to obtain a prediction result group {k1, k2, k3, k4}, and the prediction result group {k1, k2, k3, k4} is voted according to a preset voting principle to obtain the comprehensive prediction result k.
To verify the effectiveness of the convolutional neural network, this embodiment performs training and testing of the network framework using a water body extraction data set and compares with other methods. The data set comprises 700 training images with point labels and 300 test images, each visible light remote sensing image having a resolution of 492 × 492 × 3. The algorithm proposed in this embodiment (i.e., NFACNN) is compared with the fully supervised approach and point label supervision. The specific results are shown in FIG. 6. There are 6 evaluation indexes, namely the foreground intersection-over-union (fgIoU), background intersection-over-union (bgIoU), mean intersection-over-union (mIoU), foreground dice coefficient (fgDice), background dice coefficient (bgDice) and mean dice coefficient (mDice). As can be seen from FIG. 6, on all 6 evaluation indexes the method of this embodiment is higher than point label supervision, where Point (OP) represents using only positive-example point labels, Point (W1) represents a positive-to-negative example ratio of 10:1 in the cross entropy, Point (W2) represents a positive-to-negative example ratio of 2:1 in the cross entropy, Point (BCE) denotes using plain cross entropy, NFACNN denotes the neighbor aggregation convolutional neural network of this embodiment, and Full Supervision denotes the fully supervised approach. In addition, the mIoU of NFACNN is only 4.19% lower than full supervision, and its mDice only 3.06% lower, close to the fully supervised level.
FIG. 7 shows the qualitative analysis results of two remote sensing images by different methods. Point label supervision has more erroneous and missing predictions, while the result of the NFACNN method is very close to full supervision; although some water body parts have errors within a certain small range, in some parts the method is even more accurate than full supervision and consistent with the fully supervised label. In conclusion, the weak supervision technique used by the method can improve the accuracy of water body area prediction. The adjacent sampler is beneficial to decoupling the features of the water body and the non-water body, improving the accuracy and performance of water body extraction; the adjacent feature aggregation part aggregates the water body features of adjacent images, obviously reducing the false extraction and missing extraction areas and better preserving the shape characteristics of the water body.
In order to verify the effectiveness of the iterative training, this embodiment compares the prediction results of different numbers of iterations. After three rounds of iterative training, the result is gradually improved. Fig. 8 shows the trend of each round of iterative training: after iterative training, the mIoU of our method rises from 71.93% to 74.29% and the mDice from 82.52% to 84.37%. Fig. 9 shows the water body extraction prediction result of each iteration of training; it can be seen that iterative training gradually corrects the boundary: after the first iteration the boundary is not smooth and has many burrs, after the second iteration the boundary of the water body region is smoother, and after the third iteration the boundary between water body and non-water body is more accurate. The iterative training method can gradually correct the boundary of the water body outline, so that the transition of the water body outline is smooth and natural and closer to the fully supervised prediction result.
In one embodiment, the weak supervision deep learning water body extraction device comprises a neural network construction training module, a proximity sampling module, a water body extraction probability map group prediction module and a water body extraction result determination module, wherein the neural network construction training module is used for constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network; the proximity sampling module is used for acquiring an original remote sensing image, and performing proximity sampling on the original remote sensing image through a proximity sampler to obtain a group of proximity image groups; the water body extraction probability map group prediction module is used for inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group; and the water body extraction result determining module is used for carrying out binarization processing on the predicted water body extraction probability map set to obtain a prediction result set, and voting the prediction result set according to a preset voting principle to determine to obtain a final water body extraction result.
For specific limitations of the weak supervised deep learning water body extraction device, reference may be made to the above limitations of the weak supervised deep learning water body extraction method, and details are not described here. All modules in the weak supervision deep learning water body extraction device can be completely or partially realized through software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device comprises a memory and a processor, the memory stores a computer program, and the processor realizes the steps of the weak supervised deep learning water body extraction method when executing the computer program.
In one embodiment, a computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the above weak supervision deep learning water body extraction method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions.

These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The method, the device, the computer equipment and the medium for extracting the weakly supervised deep learning water body provided by the invention are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the core concepts of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (9)

1. A weak supervision deep learning water body extraction method, characterized by comprising the following steps:
step S100: constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network;
step S200: acquiring an original remote sensing image, and carrying out proximity sampling on the original remote sensing image through a proximity sampler to obtain a group of proximity image groups;
step S300: inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group;
step S400: carrying out binarization processing on the predicted water body extraction probability map group to obtain a prediction result group, and voting the prediction result group according to a preset voting principle to determine to obtain a final water body extraction result;
the step S100 includes:
step S110: constructing an initial convolutional neural network, wherein the structure of the initial convolutional neural network is an encoder-decoder network structure;
step S120: acquiring a remote sensing image training set, and carrying out proximity sampling on remote sensing images in the remote sensing image training set through a proximity sampler to obtain a group of training proximity image groups;
step S130: inputting the training adjacent image group into the initial convolutional neural network, and extracting adjacent features to obtain an adjacent feature group;
step S140: carrying out proximity feature aggregation on the proximity feature group to obtain features after proximity aggregation;
step S150: carrying out post-processing and point constraint processing on the characteristics after the adjacent aggregation to obtain a pseudo label;
step S160: replacing the point label supervision information in the initial convolutional neural network with the pseudo label, repeating the steps S120 to S130 to obtain a prediction probability map group, carrying out binarization processing on the prediction probability map group to obtain a training prediction result group, and voting the training prediction result group according to a preset voting principle to obtain a comprehensive prediction result;
step S180: and replacing the pseudo label with the comprehensive prediction result, repeating the step S160 to carry out iterative training, and finishing the training when the iteration times reach the preset iteration times to obtain the trained convolutional neural network.
2. The method according to claim 1, wherein the width and height of the original remote sensing image are W and H, respectively, the size of each grid is K × K, and step S200 comprises:
step S210: dividing the original remote sensing image into (H/K) × (W/K) grids;
step S220: selecting all elements of the grid in the ith row and jth column from the (H/K) × (W/K) grids and respectively regarding them as the (i, j)th elements of the adjacent images of the adjacent image group in the adjacent sampler, until sampling of all grids is completed and a group of adjacent images is generated, wherein 1 ≤ i ≤ H/K and 1 ≤ j ≤ W/K.
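The grid sampling of claim 2 is equivalent to strided slicing (a pixel-unshuffle): each of the K × K positions within a grid produces one adjacent image. A minimal NumPy sketch, assuming H and W are divisible by K:

```python
import numpy as np

def proximity_sample(image, K=2):
    """Proximity sampler sketch: split an H x W (x C) image into K*K
    adjacent images, where the (i, j)-th pixel of each adjacent image is
    one element of the (i, j)-th K x K grid. Implemented as strided
    slicing; K = 2 reproduces the group of four adjacent images
    {n1, n2, n3, n4} used in the description. H and W are assumed to be
    divisible by K."""
    return [image[u::K, v::K] for u in range(K) for v in range(K)]
```

With K = 2, each adjacent image has shape (H/2, W/2) and the four images together contain every pixel of the original exactly once.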
3. The method of claim 1, wherein between step S160 and step S180 the method further comprises:
step S170: calculating a binary cross entropy Lbce and a Dice loss function Ldice according to the comprehensive prediction result and the point label supervision information, summing the binary cross entropy Lbce and the Dice loss function Ldice to obtain a total training loss, and performing back propagation on the initial convolutional neural network according to the total training loss to update parameters in the initial convolutional neural network.
4. The method of claim 1, wherein step S140 comprises:
step S141: obtaining an intermediate result using maximal pooling along a channel dimension for each neighboring feature in the set of neighboring features;
step S142: carrying out binarization processing on the intermediate results of all adjacent features by using a maximum inter-class variance method to obtain adjacent feature graphs of all adjacent features;
step S143: and voting the adjacent feature maps of all the adjacent features of the adjacent feature group according to a preset voting principle to obtain the features after the adjacent aggregation.
5. The method of claim 1, wherein step S150 comprises:
step S151: filling holes in the features after the adjacent aggregation, and performing a morphological opening operation on the aggregated features after hole filling to remove noise points;
step S152: screening the aggregated features after noise removal by using point constraint processing to obtain a pseudo label.
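The morphological opening of step S151 can be illustrated with a small NumPy sketch; the 3 × 3 structuring element is an assumption, and in practice `scipy.ndimage.binary_fill_holes` and `scipy.ndimage.binary_opening` cover both post-processing operations:

```python
import numpy as np

def _shift_windows(mask):
    # Yield the 3x3 neighborhood slices of every pixel (zero-padded borders).
    p = np.pad(mask, 1, constant_values=0)
    h, w = mask.shape
    for du in range(3):
        for dv in range(3):
            yield p[du:du + h, dv:dv + w]

def _erode(mask):
    out = np.ones_like(mask)
    for win in _shift_windows(mask):
        out &= win            # pixel survives only if its full 3x3 block is set
    return out

def _dilate(mask):
    out = np.zeros_like(mask)
    for win in _shift_windows(mask):
        out |= win            # pixel is set if any 3x3 neighbor is set
    return out

def open_mask(mask):
    """Morphological opening (erosion then dilation), as used in step S151
    to remove noise points; the 3x3 structuring element is an assumption."""
    return _dilate(_erode(mask.astype(bool))).astype(np.uint8)
```

Opening removes isolated noise pixels while leaving solid regions such as water bodies essentially unchanged.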
6. The method of claim 1, wherein the encoder-decoder network structure comprises a plurality of encoding modules and a plurality of decoding modules, each encoding module comprising two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit, and a max pooling layer, the last decoding module comprising two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit, and a sigmoid function, and the remaining decoding modules comprising two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit, and an upsampling layer.
7. A weak supervision deep learning water body extraction device, characterized in that the device comprises:
the neural network construction training module is used for constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network;
the system comprises an adjacent sampling module, a remote sensing image acquisition module and a data acquisition module, wherein the adjacent sampling module is used for acquiring an original remote sensing image and carrying out adjacent sampling on the original remote sensing image through an adjacent sampler to obtain a group of adjacent image groups;
the water body extraction probability map group prediction module is used for inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group;
the water body extraction result determining module is used for carrying out binarization processing on the predicted water body extraction probability map set to obtain a prediction result set, and voting the prediction result set according to a preset voting principle to determine to obtain a final water body extraction result;
the neural network construction training module comprises a step of constructing an initial convolutional neural network, wherein the structure of the initial convolutional neural network is an encoder-decoder network structure; acquiring a remote sensing image training set, and carrying out proximity sampling on remote sensing images in the remote sensing image training set through a proximity sampler to obtain a group of training adjacent image groups; inputting the training adjacent image group into the initial convolutional neural network, and extracting adjacent features to obtain an adjacent feature group; carrying out proximity feature aggregation on the adjacent feature group to obtain features after proximity aggregation; carrying out post-processing and point constraint processing on the features after proximity aggregation to obtain a pseudo label; replacing the point label supervision information in the initial convolutional neural network with the pseudo label, repeating the steps of obtaining a training adjacent image group and obtaining an adjacent feature group to obtain a prediction probability map group, carrying out binarization processing on the prediction probability map group to obtain a training prediction result group, and voting the training prediction result group according to a preset voting principle to obtain a comprehensive prediction result; and replacing the pseudo label with the comprehensive prediction result, repeating the step of obtaining the comprehensive prediction result to carry out iterative training, and finishing the training when the number of iterations reaches the preset number to obtain the trained convolutional neural network.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
9. Computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202110684292.5A 2021-06-21 2021-06-21 Weak supervision deep learning water body extraction method and device, computer equipment and medium Active CN113255581B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110684292.5A CN113255581B (en) 2021-06-21 2021-06-21 Weak supervision deep learning water body extraction method and device, computer equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110684292.5A CN113255581B (en) 2021-06-21 2021-06-21 Weak supervision deep learning water body extraction method and device, computer equipment and medium

Publications (2)

Publication Number Publication Date
CN113255581A CN113255581A (en) 2021-08-13
CN113255581B true CN113255581B (en) 2021-09-28

Family

ID=77188902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110684292.5A Active CN113255581B (en) 2021-06-21 2021-06-21 Weak supervision deep learning water body extraction method and device, computer equipment and medium

Country Status (1)

Country Link
CN (1) CN113255581B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019136946A1 (en) * 2018-01-15 2019-07-18 中山大学 Deep learning-based weakly supervised salient object detection method and system
CN111008647A (en) * 2019-11-06 2020-04-14 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111579506A (en) * 2020-04-20 2020-08-25 湖南大学 Multi-camera hyperspectral imaging method, system and medium based on deep learning
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019136946A1 (en) * 2018-01-15 2019-07-18 中山大学 Deep learning-based weakly supervised salient object detection method and system
CN111008647A (en) * 2019-11-06 2020-04-14 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111579506A (en) * 2020-04-20 2020-08-25 湖南大学 Multi-camera hyperspectral imaging method, system and medium based on deep learning
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on the Spatial Scale Effect of Vegetation Indices Based on UAV Data; Wei Gaolei et al.; Geospatial Information; 2021-04-30; Vol. 19, No. 4, pp. 4-9 *

Also Published As

Publication number Publication date
CN113255581A (en) 2021-08-13

Similar Documents

Publication Publication Date Title
CN110930397B (en) Magnetic resonance image segmentation method and device, terminal equipment and storage medium
CN111507990B (en) Tunnel surface defect segmentation method based on deep learning
CN114120102A (en) Boundary-optimized remote sensing image semantic segmentation method, device, equipment and medium
CN111696094B (en) Immunohistochemical PD-L1 membrane staining pathological section image processing method, device and equipment
CN111650453B (en) Power equipment diagnosis method and system based on windowing characteristic Hilbert imaging
CN110728654A (en) Automatic pipeline detection and classification method based on deep residual error neural network
CN111899353A (en) Three-dimensional scanning point cloud hole filling method based on generation countermeasure network
CN111832615A (en) Sample expansion method and system based on foreground and background feature fusion
Qu et al. The algorithm of concrete surface crack detection based on the genetic programming and percolation model
CN113269224B (en) Scene image classification method, system and storage medium
CN112036249B (en) Method, system, medium and terminal for end-to-end pedestrian detection and attribute identification
CN112489023A (en) Pavement crack detection method based on multiple scales and multiple layers
CN116994140A (en) Cultivated land extraction method, device, equipment and medium based on remote sensing image
CN114972759A (en) Remote sensing image semantic segmentation method based on hierarchical contour cost function
CN114283285A (en) Cross consistency self-training remote sensing image semantic segmentation network training method and device
CN114511710A (en) Image target detection method based on convolutional neural network
CN113420619A (en) Remote sensing image building extraction method
CN112906816A (en) Target detection method and device based on optical differential and two-channel neural network
CN115457057A (en) Multi-scale feature fusion gland segmentation method adopting deep supervision strategy
CN113344933B (en) Glandular cell segmentation method based on multi-level feature fusion network
CN113177554B (en) Thyroid nodule identification and segmentation method, system, storage medium and equipment
CN114445356A (en) Multi-resolution-based full-field pathological section image tumor rapid positioning method
CN113255581B (en) Weak supervision deep learning water body extraction method and device, computer equipment and medium
CN112488996A (en) Inhomogeneous three-dimensional esophageal cancer energy spectrum CT (computed tomography) weak supervision automatic labeling method and system
CN112686912B (en) Acute stroke lesion segmentation method based on gradual learning and mixed samples

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant