CN113255581B - Weak supervision deep learning water body extraction method and device, computer equipment and medium - Google Patents

Weak supervision deep learning water body extraction method and device, computer equipment and medium

Info

Publication number
CN113255581B
CN113255581B (application CN202110684292.5A)
Authority
CN
China
Prior art keywords
group
adjacent
neural network
training
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110684292.5A
Other languages
Chinese (zh)
Other versions
CN113255581A (en
Inventor
方乐缘
鲁鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202110684292.5A priority Critical patent/CN113255581B/en
Publication of CN113255581A publication Critical patent/CN113255581A/en
Application granted granted Critical
Publication of CN113255581B publication Critical patent/CN113255581B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a weakly supervised deep learning water body extraction method, device, computer equipment and medium. The method comprises: constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network on the training set to obtain a trained convolutional neural network; acquiring an original remote sensing image and performing proximity sampling on it with an adjacent sampler to obtain a group of adjacent images; inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group; and binarizing the predicted water body extraction probability map group to obtain a prediction result group, then voting on the prediction result group according to a preset voting principle to determine the final water body extraction result. The method effectively improves the precision of water body extraction.

Description

Weak supervision deep learning water body extraction method and device, computer equipment and medium
Technical Field
The invention relates to the technical field of remote sensing image processing, and in particular to a weakly supervised deep learning water body extraction method and device, computer equipment and a medium.
Background
Water body extraction refers to extracting water surfaces from an image, in particular extracting naturally formed water bodies such as rivers, lakes and seas, and artificial water bodies such as paddy fields, reservoirs and canals, from a remote sensing image. Traditional remote sensing water body extraction methods include the single-band method, the water index method, supervised classification, the inter-spectral relationship method and related algorithms. Although these methods have made progress in water body extraction, they still suffer from a low degree of automation, tedious hand-crafted features and insufficient extraction precision. Meanwhile, with the rapid progress of remote sensing technology, the surface texture information contained in high-resolution remote sensing images has become ever more refined and the edge structure information ever richer, so traditional remote sensing water body extraction methods struggle to fully exploit the semantically rich information of high-resolution remote sensing images and to meet growing remote sensing application demands.
Disclosure of Invention
In view of the above technical problems, the invention provides a weakly supervised deep learning water body extraction method and device, computer equipment and a medium that can effectively improve water body extraction precision.
In one embodiment, a weakly supervised deep learning water body extraction method comprises the following steps:
step S100: constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network;
step S200: acquiring an original remote sensing image, and performing proximity sampling on the original remote sensing image with an adjacent sampler to obtain a group of adjacent images;
step S300: inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group;
step S400: binarizing the predicted water body extraction probability map group to obtain a prediction result group, and voting on the prediction result group according to a preset voting principle to determine the final water body extraction result.
Preferably, the width and height of the original remote sensing image are W and H, respectively, the size of each grid is K × K, and step S200 includes:
step S210: splitting the original remote sensing image into W/K × H/K grids;
step S220: selecting, from the grid in the i-th row and j-th column, all of its K × K elements and regarding them respectively as the (i, j)-th elements of the adjacent images in the adjacent image group, until sampling of all W/K × H/K grids is completed, thereby generating a group of adjacent images, where 1 ≤ i ≤ H/K and 1 ≤ j ≤ W/K.
preferably, step S100 includes:
step S110: constructing an initial convolutional neural network, wherein the structure of the initial convolutional neural network is an encoder-decoder network structure;
step S120: acquiring a remote sensing image training set, and carrying out proximity sampling on remote sensing images in the remote sensing image training set through a proximity sampler to obtain a group of training proximity image groups;
step S130: inputting the training adjacent image group into an initial convolutional neural network, and extracting adjacent features to obtain an adjacent feature group;
step S140: performing adjacent feature aggregation on the adjacent feature group to obtain aggregated adjacent features;
step S150: performing post-processing and point constraint processing on the aggregated adjacent features to obtain a pseudo label;
step S160: replacing the point label supervision information in the initial convolutional neural network with the pseudo label, repeating steps S120 to S130 to obtain a prediction probability map group, binarizing the prediction probability map group to obtain a training prediction result group, and voting on the training prediction result group according to a preset voting principle to obtain a comprehensive prediction result;
step S180: replacing the pseudo label with the comprehensive prediction result, repeating step S160 for iterative training, and ending training when the number of iterations reaches a preset number, to obtain the trained convolutional neural network.
Preferably, after step S160 and before step S180, the method further includes:
step S170: calculating a binary cross entropy L_bce and a Dice loss L_dice from the comprehensive prediction result and the point label supervision information, summing L_bce and L_dice to obtain the total training loss, and back-propagating through the initial convolutional neural network according to the total training loss to update the parameters of the initial convolutional neural network.
Preferably, step S140 includes:
step S141: obtaining an intermediate result by max pooling along the channel dimension for each adjacent feature in the adjacent feature group;
step S142: binarizing the intermediate results of all adjacent features with the maximum inter-class variance (Otsu) method to obtain an adjacent feature map for each adjacent feature;
step S143: voting on the adjacent feature maps of all adjacent features of the adjacent feature group according to a preset voting principle to obtain the aggregated adjacent features.
Preferably, step S150 includes:
step S151: filling holes in the aggregated adjacent features, and performing a morphological opening operation on the hole-filled aggregated features to remove noise points;
step S152: screening the denoised aggregated features with point constraint processing to obtain a pseudo label.
Preferably, the encoder-decoder network structure specifically includes a plurality of encoding modules and a plurality of decoding modules. Each encoding module includes two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit (ReLU) and a max pooling layer; the last decoding module includes two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit and a sigmoid function; the remaining decoding modules include two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit and an upsampling layer.
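The spatial bookkeeping of such an encoder-decoder can be traced with a small sketch. The number of modules and the input resolution below are illustrative assumptions (the patent does not specify them): each encoding module ends with max pooling that halves the spatial size, and every decoding module except the last ends with upsampling that doubles it.

```python
def encoder_decoder_shapes(h, w, num_modules=4):
    """Trace spatial sizes through the assumed encoder-decoder.

    Each encoding module E1..En halves H and W (max pooling);
    each decoding module except the last doubles them (upsampling);
    the last decoding module keeps the size and applies a sigmoid.
    Module count and input size are illustrative assumptions only.
    """
    shapes = [("input", h, w)]
    for k in range(1, num_modules + 1):          # encoder path
        h, w = h // 2, w // 2
        shapes.append((f"E{k}", h, w))
    for k in range(1, num_modules):              # decoder path with upsampling
        h, w = h * 2, w * 2
        shapes.append((f"D{k}", h, w))
    shapes.append((f"D{num_modules}", h, w))     # last module: sigmoid, no upsampling
    return shapes

# With four modules on each side (an assumption), a 256x256 input is reduced
# to 16x16 by E4 and brought back to 128x128 by D4.
print(encoder_decoder_shapes(256, 256))
```

Because the last decoding module has no upsampling layer, n encoders and n decoders under these assumptions yield an output at half the input resolution; the actual module counts in the patent are unspecified.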
In one embodiment, a weakly supervised deep learning water body extraction device comprises:
the neural network construction training module is used for constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network;
the proximity sampling module is used for acquiring an original remote sensing image and carrying out proximity sampling on the original remote sensing image through a proximity sampler to obtain a group of proximity image groups;
the water body extraction probability map group prediction module is used for inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group;
and the water body extraction result determining module is used for binarizing the predicted water body extraction probability map group to obtain a prediction result group, and voting on the prediction result group according to a preset voting principle to determine the final water body extraction result.
In an embodiment, a computer device comprises a memory and a processor, the memory storing a computer program, the processor implementing the steps of the above method when executing the computer program.
In an embodiment, a computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method.
According to the weakly supervised deep learning water body extraction method, device, computer equipment and medium, the original remote sensing image is proximity-sampled by the adjacent sampler to obtain a group of adjacent images, which helps to decouple water and non-water features, improves the precision of water body extraction and improves extraction performance. The group of adjacent images is input into the trained convolutional neural network, and the water body features of the adjacent images are aggregated to generate the predicted water body extraction probability maps, significantly reducing falsely extracted and missed water regions. Finally, the predicted water body extraction probability map group is binarized to obtain a prediction result group, and the prediction result group is voted on according to a preset voting principle to obtain the final water body extraction result, effectively improving the accuracy of water region prediction.
Drawings
Fig. 1 is a flowchart of a method for extracting a water body in weak supervised deep learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a convolutional neural network training process according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of proximity sampling by the adjacent sampler according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a portion of a neighborhood feature aggregation proposed in one embodiment of the present invention;
FIG. 5 is a schematic view of a post-processing section according to an embodiment of the present invention;
FIG. 6 is a table comparing test results of one embodiment of the method of the present invention to other prior art methods;
FIG. 7 is a schematic diagram of comparison of water extraction results between one embodiment of the present invention and other prior art methods;
FIG. 8 is a comparison of test results of iterative training as provided by one embodiment of the present invention;
fig. 9 is a schematic diagram illustrating comparison of water body extraction results of iterative training according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the present invention is further described in detail below with reference to the accompanying drawings.
In one embodiment, as shown in fig. 1, a method for extracting a weakly supervised deep learning water body includes the following steps:
step S100: constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network on the training set to obtain the trained convolutional neural network.
In recent years, deep learning has become a new focus in the field of artificial intelligence. The rapid development of deep learning technology and improvements in computer hardware have enabled deep learning, especially convolutional neural networks, to be successfully applied in many fields such as image classification, object detection and semantic segmentation, with performance exceeding many traditional algorithms; it is therefore natural to cast water body extraction as a binary semantic segmentation problem. In the method, the trained convolutional neural network is obtained by constructing and training an initial convolutional neural network. Owing to the strong capability of convolutional neural networks to capture the information in remote sensing images, using a convolutional neural network as an automatic feature extractor overcomes the shortcomings of the traditional methods well and improves the accuracy of water body extraction.
In one embodiment, step S100 includes:
step S110: and constructing an initial convolutional neural network, wherein the structure of the initial convolutional neural network is an encoder-decoder network structure.
Further, the encoder-decoder network structure specifically includes a plurality of encoding modules and a plurality of decoding modules. Each encoding module includes two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit (ReLU) and a max pooling layer; the last decoding module includes two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit and a sigmoid function; the remaining decoding modules include two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit and an upsampling layer.
Step S120: and acquiring a remote sensing image training set, and carrying out proximity sampling on the remote sensing images in the remote sensing image training set through a proximity sampler to obtain a group of training proximity image groups.
Specifically, as shown in fig. 2, the remote sensing images in the present invention are all corresponding high-resolution visible-light remote sensing images. Performing proximity sampling on a high-resolution visible-light remote sensing image x means sampling the same image x with an adjacent sampler N = (n1, n2, ..., nI), using ni in the adjacent sampler N to acquire the corresponding adjacent image ni(x), where i denotes the i-th adjacent image in the adjacent image group, and {n1(x), n2(x), ..., nI(x)} is the adjacent image group sampled by the adjacent sampler N.
Further, the width and height of the remote sensing images in the remote sensing image training set are W and H, respectively, the size of each grid is K × K, and step S120 includes:
step S121: splitting each remote sensing image in the training set into W/K × H/K grids;
step S122: selecting, from the grid in the i-th row and j-th column, all of its K × K elements and regarding them respectively as the (i, j)-th elements of the images in the training adjacent image group, until sampling of all W/K × H/K grids is completed, thereby generating a group of training adjacent images, where 1 ≤ i ≤ H/K and 1 ≤ j ≤ W/K.
Specifically, when K is 2, for the grid in the i-th row and j-th column, the upper-left, upper-right, lower-left and lower-right positions are selected and regarded respectively as the (i, j)-th elements of N = (n1, n2, n3, n4); the previous steps are repeated for all W/K × H/K grids until sampling of all grids is completed, generating the adjacent sampler N = (n1, n2, n3, n4). Given an image x, a group of adjacent images (n1(x), n2(x), n3(x), n4(x)) is generated, where each image has size W/K × H/K.
As shown in fig. 3, this example illustrates generating an adjacent image group from a single image x with the adjacent sampler, where K = 2 and the grid size is 2 × 2. The upper-left, upper-right, lower-left and lower-right elements are selected in each grid in turn and are labeled Ai, Bi, Ci and Di respectively, where i denotes the index of the grid the element belongs to. All pixels labeled Ai become the pixels of the down-sampled image n1(x), all pixels labeled Bi become the pixels of the down-sampled image n2(x), and likewise for Ci and Di. The resulting adjacent image group (n1(x), n2(x), n3(x), n4(x)) is shown on the right of fig. 3. Weakly supervised learning with the adjacent sampler and the convolutional neural network requires no pixel-level labels, effectively saving the labor and financial cost of remote sensing image annotation.
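The proximity sampling above amounts to strided slicing over the image. A minimal numpy sketch, where the function name and the 4 × 4 toy image are illustrative assumptions:

```python
import numpy as np

def adjacent_sample(x, k=2):
    """Split x (H x W) into k x k grids and collect, for each of the k*k
    in-grid positions, one down-sampled image of size (H/k) x (W/k)."""
    h, w = x.shape[:2]
    assert h % k == 0 and w % k == 0, "image must tile evenly into k x k grids"
    # Taking position (di, dj) from every grid is plain strided slicing.
    return [x[di::k, dj::k] for di in range(k) for dj in range(k)]

x = np.arange(16).reshape(4, 4)      # toy 4x4 "image"
n1, n2, n3, n4 = adjacent_sample(x, k=2)
# n1 holds the upper-left pixel of every 2x2 grid (the Ai pixels),
# n2 the upper-right (Bi), n3 the lower-left (Ci), n4 the lower-right (Di).
print(n1.tolist())  # [[0, 2], [8, 10]]
```

Each of the four outputs has size H/2 × W/2, matching the adjacent image sizes described for fig. 3.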
Step S130: and inputting the training adjacent image group into an initial convolutional neural network, and extracting adjacent features to obtain an adjacent feature group.
In particular, as shown in fig. 2, the end-to-end convolutional neural network that extracts features from the adjacent image group {n1(x), n2(x), n3(x), n4(x)} uses an encoder-decoder network structure supervised during training by a point label w. In fig. 2, N denotes the adjacent sampler, E_k denotes the k-th encoding module, D_k denotes the k-th decoding module, S denotes the sigmoid function, and fi(x) denotes the feature extracted by the second-to-last convolutional layer of the last decoding module from the i-th adjacent image ni(x). Features are extracted from each adjacent image in turn, and the adjacent feature group is denoted {f1(x), f2(x), f3(x), f4(x)}.
Step S140: and carrying out adjacent feature aggregation on the adjacent feature group to obtain adjacent aggregated features.
Further, step S140 includes:
step S141: obtaining an intermediate result by max pooling along the channel dimension for each adjacent feature in the adjacent feature group;
step S142: binarizing the intermediate results of all adjacent features with the maximum inter-class variance (Otsu) method to obtain an adjacent feature map for each adjacent feature;
step S143: voting on the adjacent feature maps of all adjacent features of the adjacent feature group according to a preset voting principle to obtain the aggregated adjacent features.
Specifically, in fig. 4, the input adjacent feature group {f1, f2, f3, f4} has dimension B × C × W/2 × H/2 × I, where B is the batch size, C is the number of channels of the feature map, W and H are the width and height of the input image x, and I is the number of images in an adjacent image group; in our method, when the grid size K = 2 in the adjacent sampler, I = 4. Max pooling along the channel dimension (C-MaxPooling) reduces each adjacent feature of size B × C × W/2 × H/2 to B × W/2 × H/2, giving B × I × W/2 × H/2 features in total; binarization with the maximum inter-class variance method (Otsu) then yields a feature map of size B × I × W/2 × H/2; finally, voting along the I direction produces the B × W/2 × H/2 aggregated adjacent feature F. The method exploits the property that the intra-class difference between adjacent water body pixels is smaller than that between adjacent pixels of other natural ground objects to distinguish water bodies from other natural ground objects, making the water contour in the extraction result clearer.
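A minimal numpy sketch of this aggregation for a single sample (the Otsu implementation, function names, and the tie-handling rule that at least half the votes counts as water are simplifying assumptions):

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Maximum inter-class variance (Otsu) threshold for a 2-D array."""
    hist, edges = np.histogram(img, bins=bins)
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                     # class-0 weight at each cut
    m = np.cumsum(p * centers)            # cumulative mean
    mt = m[-1]                            # global mean
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    var_between = np.zeros_like(w0)
    var_between[valid] = (mt * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(var_between)]

def aggregate_adjacent(features):
    """features: list of I arrays, each C x H x W (one per adjacent image).
    Channel max-pool -> Otsu binarize -> majority vote over the I maps."""
    binary_maps = []
    for f in features:
        pooled = f.max(axis=0)                    # C-MaxPooling: C x H x W -> H x W
        binary_maps.append(pooled > otsu_threshold(pooled))
    votes = np.stack(binary_maps).sum(axis=0)     # I x H x W -> per-pixel vote count
    return votes >= (len(features) + 1) // 2      # majority vote -> aggregated F
```

With I = 4 adjacent features, a pixel is kept as water when at least 2 of the 4 binarized maps agree; batching adds a leading B dimension without changing the logic.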
Step S150: and carrying out post-processing and point constraint processing on the adjacent aggregated features to obtain the pseudo label. Specifically, Post Processing (PP) includes region filling and morphological Processing.
Further, as shown in fig. 5, step S150 includes:
step S151: filling holes in the aggregated adjacent features, and performing a morphological opening operation on the hole-filled aggregated features to remove noise. Specifically, hole filling (Fill) fills the holes inside closed regions so that they become water body regions.
step S152: screening the denoised aggregated features with point constraint processing to obtain the pseudo label.
Specifically, holes in the aggregated adjacent feature are first filled, and noise is then removed with a morphological opening operation (Open). Point label constraint (Point constraint) is then applied to the processed image as follows: if a region in the image contains a point label, the whole region is kept; if a region does not contain a point label, it is discarded.
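Assuming SciPy is available (its `ndimage` module provides the needed morphology primitives), the fill-open-constrain chain might be sketched as follows; the function name, the 3 × 3 structuring element, and the (row, col) point-label format are illustrative assumptions:

```python
import numpy as np
from scipy import ndimage

def postprocess_with_points(mask, point_labels):
    """mask: boolean H x W aggregated feature; point_labels: list of (row, col)
    water point annotations. Fill holes, open to remove noise, then keep only
    connected regions containing at least one point label (point constraint)."""
    filled = ndimage.binary_fill_holes(mask)                # Fill: close interior holes
    opened = ndimage.binary_opening(filled,
                                    structure=np.ones((3, 3)))  # Open: remove noise
    labeled, _ = ndimage.label(opened)                      # connected components
    keep = {labeled[r, c] for r, c in point_labels if labeled[r, c] != 0}
    if not keep:
        return np.zeros_like(mask, dtype=bool)
    return np.isin(labeled, sorted(keep))                   # pseudo label m
```

Regions that survive opening but contain no point annotation are dropped entirely, which is what limits the pseudo label to point-confirmed water bodies.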
Step S160: and replacing the pseudo label with the point label monitoring information in the initial convolutional neural network, repeating the steps from S120 to S130 to obtain a prediction probability map set, carrying out binarization processing on the prediction probability map set to obtain a training prediction result set, and voting the training prediction result set according to a preset voting principle to obtain a comprehensive prediction result.
Specifically, as shown in fig. 2, the pseudo label m replaces the point label w supervision information of the encoder-decoder network, and the network is trained according to steps S120 to S130. A prediction probability map group {p1, p2, p3, p4} is obtained from the last decoding module, where pi denotes the probability map obtained for the i-th adjacent image. The group {p1, p2, p3, p4} is binarized with a threshold of 0.5 to obtain the training prediction result group {k1, k2, k3, k4}, and the training prediction result group {k1, k2, k3, k4} is voted on according to the preset voting principle to obtain the comprehensive prediction result k, the preset voting principle being that the minority defers to the majority (majority vote).
Step S180: and (5) replacing the pseudo label with the comprehensive prediction result, repeating the step (S160) to carry out iterative training, and finishing the training when the iteration times reach the preset iteration times to obtain the trained convolutional neural network.
Specifically, the comprehensive prediction result replaces the pseudo label, that is, the current comprehensive prediction result is used as new supervision information to participate in the training process, and step S160 is repeated.
In one embodiment, after step S160 and before step S180, the method further comprises:
step S170: calculating a binary cross entropy L_bce and a Dice loss L_dice from the comprehensive prediction result and the point label supervision information, summing L_bce and L_dice to obtain the total training loss, and back-propagating through the initial convolutional neural network according to the total training loss to update the parameters of the initial convolutional neural network.
Specifically, training the parameters of the convolutional neural network through forward and backward propagation can further improve the accuracy of water body extraction.
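A minimal numpy sketch of the two loss terms and their sum (the clipping epsilon and the Dice smoothing constant are illustrative assumptions):

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    """Binary cross entropy L_bce, averaged over pixels."""
    p = np.clip(pred, eps, 1.0 - eps)   # avoid log(0)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))

def dice_loss(pred, target, smooth=1.0):
    """Dice loss L_dice = 1 - 2|P∩T| / (|P| + |T|), with smoothing."""
    inter = np.sum(pred * target)
    return float(1.0 - (2.0 * inter + smooth)
                 / (np.sum(pred) + np.sum(target) + smooth))

def total_loss(pred, target):
    """Total training loss as described: L_bce + L_dice."""
    return bce_loss(pred, target) + dice_loss(pred, target)
```

The BCE term penalizes per-pixel disagreement while the Dice term directly rewards region overlap, which is why the two are summed for joint supervision.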
In one embodiment, step S200: acquiring an original remote sensing image, and performing proximity sampling on the original remote sensing image with the adjacent sampler to obtain a group of adjacent images.
Further, the width and height of the original remote sensing image are W and H, respectively, the size of each grid is K × K, and step S200 includes:
step S210: splitting the original remote sensing image into W/K × H/K grids;
step S220: selecting, from the grid in the i-th row and j-th column, all of its K × K elements and regarding them respectively as the (i, j)-th elements of the adjacent images in the adjacent image group, until sampling of all W/K × H/K grids is completed, thereby generating a group of adjacent images, where 1 ≤ i ≤ H/K and 1 ≤ j ≤ W/K.
in one embodiment, step S300: and inputting the adjacent image group into a trained convolutional neural network to obtain a predicted water body extraction probability map group.
Specifically, in the formal testing stage, a test remote sensing image is input into the trained convolutional neural network without any point label, and the predicted water body extraction probability map group {p1, p2, p3, p4} is output from the sigmoid function module.
In one embodiment, step S400: binarizing the predicted water body extraction probability map group to obtain a prediction result group, and voting on the prediction result group according to the preset voting principle to determine the final water body extraction result.
Specifically, the predicted water body extraction probability map group {p1, p2, p3, p4} is binarized with a threshold of 0.5 to obtain the prediction result group {k1, k2, k3, k4}, and the prediction result group {k1, k2, k3, k4} is voted on to obtain the comprehensive prediction result k.
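The binarize-and-vote step can be sketched in numpy as follows; treating a tie among an even number of maps as water is an assumption, since the text only states majority voting:

```python
import numpy as np

def vote_extraction(prob_maps, threshold=0.5):
    """prob_maps: list of H x W probability maps {p1..pI}.
    Binarize each at `threshold`, then majority-vote per pixel."""
    ks = [p >= threshold for p in prob_maps]   # prediction result group {k1..kI}
    votes = np.stack(ks).sum(axis=0)           # per-pixel water vote count
    return votes * 2 >= len(ks)                # majority vote -> final result k
```

With I = 4 probability maps, a pixel is marked as water when at least 2 of the 4 binarized maps agree.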
The weakly supervised deep learning water body extraction method distinguishes water bodies from other natural ground objects by exploiting the property that the intra-class difference between adjacent water pixels is smaller than that between adjacent pixels of other natural ground objects. Proximity sampling of the original remote sensing image by the adjacent sampler yields a group of adjacent images, which helps to decouple water and non-water features, improves the precision of water body extraction and improves extraction performance. The adjacent image group is input into the trained convolutional neural network, and the water body features of the adjacent images are aggregated to generate the predicted water body extraction probability maps, significantly reducing falsely extracted and missed water regions. Finally, the predicted water body extraction probability map group is binarized to obtain a prediction result group, and the prediction result group is voted on according to the preset voting principle to obtain the final water body extraction result, effectively improving the accuracy of water region prediction; using only point label supervision information, the prediction results can approach the fully supervised level.
In a detailed embodiment, the whole process of the weak supervision deep learning water body extraction method includes:
1) constructing an initial convolutional neural network, and sequentially generating an adjacent image group {n1(x), n2(x), n3(x), n4(x)}, an adjacent feature group {F1(x), F2(x), F3(x), F4(x)}, the adjacent aggregated feature F and the pseudo label m from a remote sensing image x in the remote sensing image training set according to steps S120 to S150, which are not repeated here;
2) replacing the point label w supervision information of the encoder-decoder network with the pseudo label m, training the network according to step S120 and step S130, and obtaining a prediction probability map group {p1, p2, p3, p4} from the last decoding module; in FIG. 2, pi represents the probability map obtained for the ith adjacent image;
3) carrying out binarization on the prediction probability map group {p1, p2, p3, p4} to obtain a prediction result group {k1, k2, k3, k4}, taking a binary threshold value of 0.5;
4) voting the prediction result group {k1, k2, k3, k4} to obtain a comprehensive prediction result k;
5) replacing the pseudo label m in step 2) with k, and repeating steps 2) to 4) to realize iterative training;
6) predicting with the trained encoder-decoder network, and carrying out binarization processing on the predicted probability map to obtain the final prediction result y.
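The iterative scheme of steps 2) to 5) can be sketched as a self-training loop; `train_fn` and `predict_fn` below are hypothetical placeholders for the encoder-decoder training and inference routines described above:

```python
def iterative_training(train_fn, predict_fn, initial_pseudo_label, n_rounds=3):
    """Schematic of steps 2) to 5): train with the current label, predict the
    probability map group {p1..p4}, binarize at 0.5, vote, and feed the vote
    result k back as the next round's label.

    train_fn and predict_fn are hypothetical placeholders, not the patent's
    actual routines; initial_pseudo_label plays the role of pseudo label m.
    """
    label = initial_pseudo_label
    for _ in range(n_rounds):
        model = train_fn(label)                  # step 2): supervise with current label
        prob_maps = predict_fn(model)            # decoder outputs {p1, p2, p3, p4}
        preds = [p >= 0.5 for p in prob_maps]    # step 3): binarize -> {k1..k4}
        votes = sum(preds)                       # per-pixel vote count
        label = votes * 2 >= len(preds)          # step 4)/5): majority vote -> k
    return label
```
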
In this embodiment, in the network training stage before step 6), the binary cross entropy Lbce and the Dice loss function Ldice are used for joint supervision. Specifically, in each iterative training process, the binary cross entropy Lbce and the Dice loss function Ldice are calculated according to each decoding output result and the point label supervision information respectively; the binary cross entropy Lbce and the Dice loss function Ldice are then summed to obtain the total training loss, back propagation is performed, and iteration is repeated until the number of iterations reaches a set threshold, at which point training is judged to be finished.
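A minimal sketch of the joint loss for one decoder output, assuming the standard definitions of binary cross entropy and Dice loss; the smoothing constant `eps` is an implementation assumption:

```python
import numpy as np

def bce_dice_loss(pred, target, eps=1e-7):
    """Joint supervision loss L = Lbce + Ldice for one decoding output.

    pred: predicted probabilities in (0, 1); target: binary labels.
    eps is a numerical-stability / smoothing constant (an assumption,
    not specified in the text).
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    l_bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    inter = np.sum(pred * target)
    l_dice = 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)
    return l_bce + l_dice
```

In training, this quantity would be computed per decoding output and the results summed into the total loss before back propagation, as the text describes.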
In the test stage, key point labels are not needed; the test image is input into the trained convolutional neural network structure, a prediction probability map group {p1, p2, p3, p4} is output through the sigmoid function module, binarization is carried out with 0.5 as the binary threshold to obtain a prediction result group {k1, k2, k3, k4}, and the prediction result group {k1, k2, k3, k4} is voted according to a preset voting principle to obtain the comprehensive prediction result k.
To verify the effectiveness of the convolutional neural network, this embodiment performs training and testing of the network framework using a water body extraction data set and compares with other methods. The data set comprises 700 training images with point labels and 300 test images, each visible light remote sensing image having a resolution of 492 × 492 × 3. The algorithm proposed in this embodiment (i.e., NFACNN) is compared with the fully supervised approach and point label supervision. The specific results are shown in FIG. 6. There are 6 evaluation indexes, namely the foreground intersection-over-union (fgIoU), background intersection-over-union (bgIoU), mean intersection-over-union (mIoU), foreground dice coefficient (fgDice), background dice coefficient (bgDice) and mean dice coefficient (mDice). As can be seen from FIG. 6, on all 6 evaluation indexes the method of this embodiment is higher than point label supervision, where Point (OP) represents using only positive-example point labels, Point (W1) represents a positive-to-negative example ratio of 10:1 in the cross entropy, Point (W2) represents a positive-to-negative example ratio of 2:1 in the cross entropy, Point (BCE) denotes using plain cross entropy, NFACNN denotes the neighbor aggregation convolutional neural network of this embodiment, and Full Supervision denotes the fully supervised approach. In addition, the mIoU of NFACNN is only 4.19% lower than full supervision, and its mDice only 3.06% lower, close to the fully supervised level.
FIG. 7 shows the qualitative analysis results of two remote sensing images by different methods. Point label supervision has more erroneous and missing predictions, while the result of the NFACNN method is very close to full supervision; although some water body parts have errors within a certain small range, in some parts the method is even more accurate than full supervision and consistent with the fully supervised label. In conclusion, the weak supervision technique used by the method can improve the accuracy of water body area prediction. The adjacent sampler is beneficial to decoupling the features of the water body and the non-water body, improving the accuracy and performance of water body extraction; the adjacent feature aggregation part aggregates the water body features of adjacent images, obviously reducing the false extraction and missing extraction areas and better preserving the shape characteristics of the water body.
In order to verify the effectiveness of the iterative training, this embodiment compares the prediction results of different numbers of iterations. After three rounds of iterative training, the result is gradually improved. Fig. 8 shows the trend of each round of iterative training: after iterative training, the mIoU of our method rises from 71.93% to 74.29% and the mDice from 82.52% to 84.37%. Fig. 9 shows the water body extraction prediction result of each iteration of training; it can be seen that iterative training gradually corrects the boundary: after the first iteration the boundary is not smooth and has many burrs, after the second iteration the boundary of the water body region is smoother, and after the third iteration the boundary between water body and non-water body is more accurate. The iterative training method can gradually correct the boundary of the water body outline, so that the transition of the water body outline is smooth and natural and closer to the fully supervised prediction result.
In one embodiment, the weak supervision deep learning water body extraction device comprises a neural network construction training module, a proximity sampling module, a water body extraction probability map group prediction module and a water body extraction result determination module, wherein the neural network construction training module is used for constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network; the proximity sampling module is used for acquiring an original remote sensing image, and performing proximity sampling on the original remote sensing image through a proximity sampler to obtain a group of proximity image groups; the water body extraction probability map group prediction module is used for inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group; and the water body extraction result determining module is used for carrying out binarization processing on the predicted water body extraction probability map set to obtain a prediction result set, and voting the prediction result set according to a preset voting principle to determine to obtain a final water body extraction result.
For specific limitations of the weak supervised deep learning water body extraction device, reference may be made to the above limitations of the weak supervised deep learning water body extraction method, and details are not described here. All modules in the weak supervision deep learning water body extraction device can be completely or partially realized through software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device comprises a memory and a processor, the memory stores a computer program, and the processor realizes the steps of the weak supervised deep learning water body extraction method when executing the computer program.
In one embodiment, a computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the above weak supervision deep learning water body extraction method.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions.

These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The method, the device, the computer equipment and the medium for extracting the weakly supervised deep learning water body provided by the invention are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the core concepts of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (9)

1. A weak supervision deep learning water body extraction method, characterized by comprising the following steps:
step S100: constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network;
step S200: acquiring an original remote sensing image, and carrying out proximity sampling on the original remote sensing image through a proximity sampler to obtain a group of proximity image groups;
step S300: inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group;
step S400: carrying out binarization processing on the predicted water body extraction probability map group to obtain a prediction result group, and voting the prediction result group according to a preset voting principle to determine to obtain a final water body extraction result;
the step S100 includes:
step S110: constructing an initial convolutional neural network, wherein the structure of the initial convolutional neural network is an encoder-decoder network structure;
step S120: acquiring a remote sensing image training set, and carrying out proximity sampling on remote sensing images in the remote sensing image training set through a proximity sampler to obtain a group of training proximity image groups;
step S130: inputting the training adjacent image group into the initial convolutional neural network, and extracting adjacent features to obtain an adjacent feature group;
step S140: carrying out proximity feature aggregation on the proximity feature group to obtain features after proximity aggregation;
step S150: carrying out post-processing and point constraint processing on the characteristics after the adjacent aggregation to obtain a pseudo label;
step S160: replacing the point label supervision information in the initial convolutional neural network with the pseudo label, repeating the steps S120 to S130 to obtain a prediction probability map group, carrying out binarization processing on the prediction probability map group to obtain a training prediction result group, and voting the training prediction result group according to a preset voting principle to obtain a comprehensive prediction result;
step S180: and replacing the pseudo label with the comprehensive prediction result, repeating the step S160 to carry out iterative training, and finishing the training when the iteration times reach the preset iteration times to obtain the trained convolutional neural network.
2. The method according to claim 1, wherein the width and height of the original remote sensing image are W and H, respectively, the size of each grid is K × K, and step S200 comprises:
step S210: dividing the original remote sensing image into (H/K) × (W/K) grids;
step S220: selecting all elements of the grid in the ith row and jth column from the (H/K) × (W/K) grids and respectively regarding them as the (i, j)th elements of the adjacent images of the adjacent image group in the adjacent sampler, until sampling of all grids is completed and a group of adjacent images is generated, wherein 1 ≤ i ≤ H/K and 1 ≤ j ≤ W/K.
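The grid sampling of claim 2 is equivalent to strided slicing (a pixel-unshuffle): each of the K × K positions within a grid produces one adjacent image. A minimal NumPy sketch, assuming H and W are divisible by K:

```python
import numpy as np

def proximity_sample(image, K=2):
    """Proximity sampler sketch: split an H x W (x C) image into K*K
    adjacent images, where the (i, j)-th pixel of each adjacent image is
    one element of the (i, j)-th K x K grid. Implemented as strided
    slicing; K = 2 reproduces the group of four adjacent images
    {n1, n2, n3, n4} used in the description. H and W are assumed to be
    divisible by K."""
    return [image[u::K, v::K] for u in range(K) for v in range(K)]
```

With K = 2, each adjacent image has shape (H/2, W/2) and the four images together contain every pixel of the original exactly once.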
3. The method of claim 1, wherein between step S160 and step S180 the method further comprises:
step S170: calculating a binary cross entropy Lbce and a Dice loss function Ldice according to the comprehensive prediction result and the point label supervision information, summing the binary cross entropy Lbce and the Dice loss function Ldice to obtain a total training loss, and performing back propagation on the initial convolutional neural network according to the total training loss to update parameters in the initial convolutional neural network.
4. The method of claim 1, wherein step S140 comprises:
step S141: obtaining an intermediate result using maximal pooling along a channel dimension for each neighboring feature in the set of neighboring features;
step S142: carrying out binarization processing on the intermediate results of all adjacent features by using a maximum inter-class variance method to obtain adjacent feature graphs of all adjacent features;
step S143: and voting the adjacent feature maps of all the adjacent features of the adjacent feature group according to a preset voting principle to obtain the features after the adjacent aggregation.
5. The method of claim 1, wherein step S150 comprises:
step S151: filling holes in the features after the adjacent aggregation, and performing a morphological opening operation on the aggregated features after hole filling to remove noise points;
step S152: screening the aggregated features after noise removal by using point constraint processing to obtain a pseudo label.
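The morphological opening of step S151 can be illustrated with a small NumPy sketch; the 3 × 3 structuring element is an assumption, and in practice `scipy.ndimage.binary_fill_holes` and `scipy.ndimage.binary_opening` cover both post-processing operations:

```python
import numpy as np

def _shift_windows(mask):
    # Yield the 3x3 neighborhood slices of every pixel (zero-padded borders).
    p = np.pad(mask, 1, constant_values=0)
    h, w = mask.shape
    for du in range(3):
        for dv in range(3):
            yield p[du:du + h, dv:dv + w]

def _erode(mask):
    out = np.ones_like(mask)
    for win in _shift_windows(mask):
        out &= win            # pixel survives only if its full 3x3 block is set
    return out

def _dilate(mask):
    out = np.zeros_like(mask)
    for win in _shift_windows(mask):
        out |= win            # pixel is set if any 3x3 neighbor is set
    return out

def open_mask(mask):
    """Morphological opening (erosion then dilation), as used in step S151
    to remove noise points; the 3x3 structuring element is an assumption."""
    return _dilate(_erode(mask.astype(bool))).astype(np.uint8)
```

Opening removes isolated noise pixels while leaving solid regions such as water bodies essentially unchanged.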
6. The method of claim 1, wherein the encoder-decoder network structure comprises a plurality of encoding modules and a plurality of decoding modules, each encoding module comprising two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit, and a max pooling layer, the last decoding module comprising two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit, and a sigmoid function, and the remaining decoding modules comprising two consecutive 3 × 3 convolutional layers, a batch normalization layer, a rectified linear unit, and an upsampling layer.
7. A weak supervision deep learning water body extraction device, characterized in that the device comprises:
the neural network construction training module is used for constructing an initial convolutional neural network, acquiring a remote sensing image training set, and training the initial convolutional neural network according to the remote sensing image training set to obtain a trained convolutional neural network;
the system comprises an adjacent sampling module, a remote sensing image acquisition module and a data acquisition module, wherein the adjacent sampling module is used for acquiring an original remote sensing image and carrying out adjacent sampling on the original remote sensing image through an adjacent sampler to obtain a group of adjacent image groups;
the water body extraction probability map group prediction module is used for inputting the adjacent image group into the trained convolutional neural network to obtain a predicted water body extraction probability map group;
the water body extraction result determining module is used for carrying out binarization processing on the predicted water body extraction probability map set to obtain a prediction result set, and voting the prediction result set according to a preset voting principle to determine to obtain a final water body extraction result;
the neural network construction training module comprises a step of constructing an initial convolutional neural network, wherein the structure of the initial convolutional neural network is an encoder-decoder network structure; acquiring a remote sensing image training set, and carrying out proximity sampling on remote sensing images in the remote sensing image training set through a proximity sampler to obtain a group of training adjacent image groups; inputting the training adjacent image group into the initial convolutional neural network, and extracting adjacent features to obtain an adjacent feature group; carrying out proximity feature aggregation on the adjacent feature group to obtain features after proximity aggregation; carrying out post-processing and point constraint processing on the features after proximity aggregation to obtain a pseudo label; replacing the point label supervision information in the initial convolutional neural network with the pseudo label, repeating the steps of obtaining a training adjacent image group and obtaining an adjacent feature group to obtain a prediction probability map group, carrying out binarization processing on the prediction probability map group to obtain a training prediction result group, and voting the training prediction result group according to a preset voting principle to obtain a comprehensive prediction result; and replacing the pseudo label with the comprehensive prediction result, repeating the step of obtaining the comprehensive prediction result to carry out iterative training, and finishing the training when the number of iterations reaches the preset number to obtain the trained convolutional neural network.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
9. Computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202110684292.5A 2021-06-21 2021-06-21 Weak supervision deep learning water body extraction method and device, computer equipment and medium Active CN113255581B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110684292.5A CN113255581B (en) 2021-06-21 2021-06-21 Weak supervision deep learning water body extraction method and device, computer equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110684292.5A CN113255581B (en) 2021-06-21 2021-06-21 Weak supervision deep learning water body extraction method and device, computer equipment and medium

Publications (2)

Publication Number Publication Date
CN113255581A CN113255581A (en) 2021-08-13
CN113255581B true CN113255581B (en) 2021-09-28

Family

ID=77188902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110684292.5A Active CN113255581B (en) 2021-06-21 2021-06-21 Weak supervision deep learning water body extraction method and device, computer equipment and medium

Country Status (1)

Country Link
CN (1) CN113255581B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019136946A1 (en) * 2018-01-15 2019-07-18 中山大学 Deep learning-based weakly supervised salient object detection method and system
CN111008647A (en) * 2019-11-06 2020-04-14 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111579506A (en) * 2020-04-20 2020-08-25 湖南大学 Multi-camera hyperspectral imaging method, system and medium based on deep learning
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019136946A1 (en) * 2018-01-15 2019-07-18 中山大学 Deep learning-based weakly supervised salient object detection method and system
CN111008647A (en) * 2019-11-06 2020-04-14 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111579506A (en) * 2020-04-20 2020-08-25 湖南大学 Multi-camera hyperspectral imaging method, system and medium based on deep learning
CN111612066A (en) * 2020-05-21 2020-09-01 成都理工大学 Remote sensing image classification method based on depth fusion convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on the Spatial Scale Effect of Vegetation Indices Based on UAV Data; Wei Gaolei et al.; Geospatial Information; 2021-04-30; Vol. 19, No. 4, pp. 4-9 *

Also Published As

Publication number Publication date
CN113255581A (en) 2021-08-13

Similar Documents

Publication Publication Date Title
CN110930397B (en) Magnetic resonance image segmentation method and device, terminal equipment and storage medium
CN111507990B (en) Tunnel surface defect segmentation method based on deep learning
CN114120102A (en) Boundary-optimized remote sensing image semantic segmentation method, device, equipment and medium
CN111696094B (en) Immunohistochemical PD-L1 membrane staining pathological section image processing method, device and equipment
CN111650453B (en) Power equipment diagnosis method and system based on windowing characteristic Hilbert imaging
CN110728654A (en) Automatic pipeline detection and classification method based on deep residual error neural network
CN111899353A (en) Three-dimensional scanning point cloud hole filling method based on generation countermeasure network
CN111832615A (en) Sample expansion method and system based on foreground and background feature fusion
Qu et al. The algorithm of concrete surface crack detection based on the genetic programming and percolation model
CN113269224B (en) Scene image classification method, system and storage medium
CN112036249B (en) Method, system, medium and terminal for end-to-end pedestrian detection and attribute identification
CN112489023A (en) Pavement crack detection method based on multiple scales and multiple layers
CN116994140A (en) Cultivated land extraction method, device, equipment and medium based on remote sensing image
CN114972759A (en) Remote sensing image semantic segmentation method based on hierarchical contour cost function
CN114283285A (en) Cross consistency self-training remote sensing image semantic segmentation network training method and device
CN114511710A (en) Image target detection method based on convolutional neural network
CN113420619A (en) Remote sensing image building extraction method
CN112906816A (en) Target detection method and device based on optical differential and two-channel neural network
CN115457057A (en) Multi-scale feature fusion gland segmentation method adopting deep supervision strategy
CN113344933B (en) Glandular cell segmentation method based on multi-level feature fusion network
CN113177554B (en) Thyroid nodule identification and segmentation method, system, storage medium and equipment
CN114445356A (en) Multi-resolution-based full-field pathological section image tumor rapid positioning method
CN113255581B (en) Weak supervision deep learning water body extraction method and device, computer equipment and medium
CN112488996A (en) Inhomogeneous three-dimensional esophageal cancer energy spectrum CT (computed tomography) weak supervision automatic labeling method and system
CN112686912B (en) Acute stroke lesion segmentation method based on gradual learning and mixed samples

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant