CN113449640B - Remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN

Info

Publication number
CN113449640B
CN113449640B (application number CN202110725267.7A)
Authority
CN
China
Prior art keywords
building
semantic segmentation
edge
gcn
cnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110725267.7A
Other languages
Chinese (zh)
Other versions
CN113449640A (en)
Inventor
刘修国
邓睿哲
陈奇
张丛珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN202110725267.7A priority Critical patent/CN113449640B/en
Publication of CN113449640A publication Critical patent/CN113449640A/en
Application granted granted Critical
Publication of CN113449640B publication Critical patent/CN113449640B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Abstract

The invention provides a remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN. A CNN is used to extract high-level semantic features of buildings from the remote sensing image, while a GCN performs fast graph reasoning on the high-resolution original image. The low-resolution deep features from the CNN are then remapped to the original image through repeated upsampling, lateral connection and convolution operations, and building edge extraction and primary building semantic segmentation are performed on them. The deep features are integrated with the edge extraction result to constrain the edges of the primary building semantic segmentation result. Finally, a graph feature adaptive optimization module lets the GCN features effectively optimize the constrained building semantic segmentation result, and a building semantic segmentation result with excellent edge performance is output. The beneficial effects of the invention are: the edge details of CNN-based building semantic segmentation results from remote sensing images are adaptively optimized, and the precision and application value of automatic building mapping results are improved.

Description

Remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN
Technical Field
The invention relates to the field of surveying and mapping science and technology, and in particular to a method for optimizing the semantic segmentation edges of buildings in remote sensing images using a multitask CNN + GCN semantic segmentation model, that is, a remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN.
Background
Accurate building vector contour information obtained from high-resolution remote sensing images provides an important basis for application fields such as urban planning, land survey, illegal-building detection and military reconnaissance. Because manual visual interpretation and labeling of high-resolution remote sensing images is extremely time-consuming, extracting buildings intelligently and quickly with a CNN-based semantic segmentation method and then generating building vector data through contour extraction and regularization is a more economical and efficient way to acquire this information. However, CNN-based remote sensing image semantic segmentation suffers from the loss of local detail texture caused by repeated downsampling, and its receptive field grows only slowly and linearly with network depth, so the network struggles to capture large-scale global semantic information. As a result, the final semantic segmentation results perform poorly at edges, and high-precision automatic vectorization of buildings from remote sensing images remains difficult. Therefore, edge optimization of remote sensing image building semantic segmentation is of great significance for improving the precision, quality and application value of automatic vectorization.
Existing CNN-based edge optimization techniques for remote sensing image building semantic segmentation can be broadly divided into edge optimization based on traditional structural modeling, on CNN feature enhancement, on edge information guidance, and on graph information integration. However, these methods do not account for the invariance of CNNs to scale, translation and rotation introduced by pooling, or for the way convolutions progressively abstract features, so they struggle to extract the fine spatial position of each pixel. How to accurately perceive the precise spatial position of each pixel in a non-CNN manner, in order to optimize CNN-based remote sensing image building semantic segmentation results, currently has no clear solution.
Disclosure of Invention
The technical problem addressed by the invention is that the CNNs used in the prior art have difficulty accurately perceiving the precise spatial position of each pixel. To this end the invention provides a remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN, which mainly comprises the following steps:
s1, constructing a remote sensing image building sample set;
s2, constructing a multitask CNN + GCN semantic segmentation model by using the ResNet and the GCN as frameworks according to the building sample set;
s3, remapping the resolution of the high-level semantic features of the building output in the step S2 to the original image, and performing building edge extraction and primary building semantic segmentation to respectively obtain a building edge extraction probability map and a primary building semantic segmentation result;
s4, performing feature integration on the building edge extraction probability graph output in the step S3 and the high-level semantic features of the building output in the step S2, and performing semantic segmentation on the building features based on edge constraint to obtain a building semantic segmentation result based on edge constraint;
s5, adopting a graph feature adaptive optimization module so that the GCN reasoning features output in step S2 optimize the edge-constrained building semantic segmentation result output in step S4, and outputting a building semantic segmentation result with excellent edge performance;
and S6, training the multitask CNN + GCN semantic segmentation model on the three building semantic segmentation results output by steps S3, S4 and S5 and the building edge extraction probability map, using backward propagation and stochastic gradient descent; once the preset precision is reached, the trained multitask CNN + GCN semantic segmentation model is obtained, and inputting the remote sensing image of an actual building into the trained model yields a better-optimized actual building edge semantic segmentation result (a schematic sketch of how these steps fit together is given after this list).
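As an illustration of how steps S2 to S5 could fit together in code, the following PyTorch sketch wires hypothetical sub-modules into one forward pass; every class name, argument and tensor convention here is an assumption made for exposition, not the patented implementation.

```python
# Illustrative PyTorch skeleton of the multitask CNN + GCN pipeline (steps S2-S5).
# All class names and shapes are assumptions for exposition, not the patented code.
import torch
import torch.nn as nn

class MultiTaskCnnGcn(nn.Module):
    def __init__(self, backbone, gcn, edge_head, seg_head, constrained_head, refine_module):
        super().__init__()
        self.backbone = backbone                  # ResNet + pyramid-style remapping (S2/S3)
        self.gcn = gcn                            # fast graph reasoning on the full-resolution image (S2)
        self.edge_head = edge_head                # building edge probability map (S3)
        self.seg_head = seg_head                  # primary semantic segmentation (S3)
        self.constrained_head = constrained_head  # edge-constrained segmentation (S4)
        self.refine_module = refine_module        # graph feature adaptive optimization (S5)

    def forward(self, image):
        cf = self.backbone(image)                       # high-level features remapped to input resolution
        bp = torch.sigmoid(self.edge_head(cf))          # edge probability map BP
        seg1 = self.seg_head(cf)                        # primary segmentation
        seg2 = self.constrained_head(cf * (1.0 + bp))   # edge-constrained segmentation, CF x (1 + BP)
        gf = self.gcn(image, bp)                        # GCN features carrying precise spatial positions
        seg3 = self.refine_module(seg2, gf)             # edge-optimized segmentation
        return seg1, seg2, seg3, bp
```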
Further, the specific steps of constructing the remote sensing image building sample set are as follows:
s11, rasterizing the original building vector data into a binary building semantic label image, and adjusting the remote sensing image extent to match the binary image;
s12, extracting the contour of the binary building label image to obtain a binary building edge label image;
s13, sliding a window of preset size with a preset step simultaneously over the remote sensing image, the building semantic label image and the building edge label image, counting the building pixels within the window to obtain the building proportion, and cropping the image patch if the proportion exceeds a preset threshold;
s14, after cropping is complete, dividing the sample set into a training set and a test set at a 7:3 ratio, with 70% of the data used for training and 30% used for testing the trained model's performance.
Further, ResNet extracts the high-level semantic features of the building through a series of convolution and pooling operations, and the resolution is remapped back to the original image through repeated upsampling, lateral connection and convolution operations.
Further, the edge-constrained building semantic segmentation is obtained through supervised learning on the building semantic labels: the CNN deep feature map is denoted CF and the building edge extraction probability map BP, the deep semantic features are integrated and constrained by the edge information in the form CF × (1 + BP), and a fully convolutional network then completes the edge-constrained building semantic segmentation.
Further, the process of graph reasoning for the original image by using the GCN has the mathematical expression:
H^(l+1) = σ(A·H^(l)·W^(l))
where A is the adjacency matrix encoding the spatial positions and potential mutual relationships of the pixels, H^(l) and H^(l+1) are the vertex features of the l-th and (l+1)-th layers, W^(l) is the learnable training weight of the l-th layer, and σ is a nonlinear activation function. When l = 0, H^(0) is the two-dimensional graph structure obtained by reshaping the input remote sensing image, with dimension (m × m) × n, where m × m is the preset window size, n is the number of input image channels, and m and n are positive integers.
Further, based on the principle that any Laplacian matrix can be diagonalized, the adjacency matrix A is constructed quickly through convolution and matrix operations, with the mathematical expression:
A = φ(BP)·diag(ρ(BP))·φ^T(BP)
where BP is the building edge extraction probability map, φ(·) is a conventional convolution used for dimension change, ρ(·) is adaptive pooling, diag(·) is the diagonalization operation, and (·)^T is the matrix transpose. Substituting A and exploiting the associative law of matrix multiplication avoids storing the huge intermediate adjacency matrix A, which reduces GPU memory overhead and preserves network efficiency, realizing fast GCN reasoning on the high-resolution original image with the mathematical expression:
H^(l+1) = σ(φ(BP)·(diag(ρ(BP))·(φ^T(BP)·H^(l)))·W^(l))
further, the graph feature adaptive optimization module optimizes the CNN building semantic segmentation result by enhancing the spatial detail features, the graph feature adaptive optimization module takes the building semantic segmentation result based on the edge constraint and the cascade feature of the GCN feature as input, generates an attention map with abundant spatial local details by a conventional convolution operation, and optimizes the building semantic segmentation result based on the edge constraint according to the attention map by the GCN feature, and the mathematical expression of the graph feature adaptive optimization module is as follows:
Seg3=δ(Seg2+δ(Concat(GF,Seg2))*GF)
wherein Seg3Optimizing the results for building semantic segmentation, Seg2For the building semantic segmentation result based on edge constraint, GF is GCN feature, δ (-) is conventional convolution operation, and Concat (-) is feature concatenation.
Further, in the training process, a loss function is used to calculate the loss value Loss of the multitask CNN + GCN semantic segmentation model, with the specific formula:
Loss = Σ_{i=1}^{3} L_seg^i + L_edge
where L_seg^i is the loss measuring the difference between the i-th semantic segmentation result and the ground-truth surface class, with the specific formula:
L_seg^i = -(1/N) Σ_{n=1}^{N} [seg*_{n,i}·log(seg_{n,i}) + (1 - seg*_{n,i})·log(1 - seg_{n,i})]
where n is the sample index and N the number of samples, seg_{n,i} is the predicted probability that a pixel in the i-th semantic segmentation result belongs to a building, seg_{n,1} denotes the primary building semantic segmentation result, seg_{n,2} the edge-constrained building semantic segmentation result, and seg_{n,3} the GCN edge-optimized building semantic segmentation result, and seg*_{n,i} is the corresponding true value, equal to 1 where a building is present and 0 otherwise;
L_edge is the loss measuring the difference between the building edge detection result and the true building edge, with the specific formula:
L_edge = -(1/N) Σ_{n=1}^{N} [BP*_n·log(BP_n) + (1 - BP*_n)·log(1 - BP_n)]
where n is the sample index, BP_n is the predicted probability that a pixel in the edge detection is a building edge, and BP*_n is the corresponding true value, equal to 1 for building-edge pixels and 0 otherwise.
Further, a gradient descent algorithm is applied to the loss function; when the loss value Loss approaches convergence, the preset precision is reached and training of the multitask CNN + GCN semantic segmentation model is complete.
Furthermore, during training, hyper-parameters including the positive/negative sample ratio, learning rate, batch size and weight decay coefficient need to be tuned according to the test results.
The technical scheme provided by the invention has the beneficial effects that:
1. The construction of the adjacency matrix is emulated through matrix diagonalization, and the associative law of matrix multiplication removes the need to store the adjacency matrix, so the GCN can act directly on the high-resolution spatial feature map of the CNN. The GCN is then used to explore the potential spatial relationships among pixels and to perceive their precise spatial positions accurately, which is more feasible than acquiring accurate building edge information with a CNN alone;
2. The end-to-end trainable and predictable multitask CNN + GCN semantic segmentation model uses GCN features rich in precise position information to optimize the CNN semantic segmentation result through the graph feature adaptive optimization module, further improving the accuracy and application value of automatic building extraction results.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a structural diagram of a multitask CNN + GCN semantic segmentation model of a semantic segmentation edge optimization method for a remote sensing image building in an embodiment of the present invention;
FIG. 2 is a flow chart of building sample set construction in a semantic segmentation edge optimization method for a remote sensing image building according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of fast graph reasoning performed by the GCN in the semantic segmentation edge optimization method for the remote sensing image building according to the embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a graph feature adaptive optimization module in the method for optimizing semantic segmentation edges of a remote sensing image building according to the embodiment of the present invention.
Detailed Description
For a more clear understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
The embodiment of the invention provides a remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a multitask CNN + GCN semantic segmentation model of a method for optimizing semantic segmentation edges of a remote sensing image building according to an embodiment of the present invention, and the specific steps include:
s1, constructing a remote sensing image building sample set, please refer to fig. 2, which specifically includes the following steps:
s11, rasterizing the original building vector data into a binary building semantic label image, and adjusting the high-resolution remote sensing image extent to match the binary image;
s12, extracting the contour of the binary building label image to obtain a binary building edge label image;
s13, sliding a 400 × 400 window with a step of 100 pixels simultaneously over the high-resolution remote sensing image, the building semantic label image and the building edge label image, counting the building pixels within the window to obtain the building proportion, and cropping the patch if the proportion exceeds 10%;
s14, after cropping is complete, dividing the sample set into a training set and a test set at a 7:3 ratio, with 70% of the data used for training and 30% used for evaluating the trained model's performance; a code sketch of this procedure is given below.
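The sample-set construction of steps S11 to S14 can be sketched as follows, assuming the remote sensing image and the rasterized building label are already co-registered numpy arrays (channel-first image); the helper names and the use of a morphological erosion for the edge label are illustrative assumptions, while the 400 × 400 window, 100-pixel stride, 10% threshold and 7:3 split follow this embodiment.

```python
# Sketch of the sample-set construction (S11-S14); file I/O and vector rasterization are omitted.
import random
import numpy as np
from scipy.ndimage import binary_erosion

def edge_label(mask: np.ndarray) -> np.ndarray:
    """Building edge label = binary mask minus its erosion (a one-pixel contour)."""
    return (mask & ~binary_erosion(mask, structure=np.ones((3, 3), bool))).astype(np.uint8)

def cut_samples(image, mask, window=400, stride=100, min_ratio=0.10):
    """Slide a window over the (c, h, w) image and 0/1 label rasters, keep crops with enough building pixels."""
    edges = edge_label(mask.astype(bool))
    samples = []
    h, w = mask.shape
    for y in range(0, h - window + 1, stride):
        for x in range(0, w - window + 1, stride):
            m = mask[y:y + window, x:x + window]
            if m.mean() > min_ratio:                      # building pixel ratio > 10%
                samples.append((image[:, y:y + window, x:x + window],
                                m, edges[y:y + window, x:x + window]))
    random.shuffle(samples)
    split = int(0.7 * len(samples))                       # 7:3 train/test split
    return samples[:split], samples[split:]
```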
S2, constructing a multitask CNN + GCN semantic segmentation model with ResNet and GCN as the framework, using the building sample set constructed in step S1. ResNet extracts high-level semantic features of buildings through a series of convolution and pooling operations, while the GCN performs fast graph reasoning on the high-resolution original image through conventional graph convolution operations to mine the potential spatial correlations among pixels.
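A minimal sketch of the ResNet feature extraction assumed in step S2 is given below; the choice of resnet50 and of the four tapped stages is an assumption for illustration rather than the exact configuration of the invention.

```python
# Minimal sketch of extracting multi-level building features with a torchvision ResNet (S2).
import torch
import torch.nn as nn
import torchvision

class ResNetFeatures(nn.Module):
    def __init__(self):
        super().__init__()
        net = torchvision.models.resnet50(weights=None)
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool)
        self.stages = nn.ModuleList([net.layer1, net.layer2, net.layer3, net.layer4])

    def forward(self, x):
        feats = []
        x = self.stem(x)
        for stage in self.stages:      # strides 4, 8, 16, 32 relative to the input
            x = stage(x)
            feats.append(x)
        return feats                   # shallow-to-deep feature maps for later remapping

feats = ResNetFeatures()(torch.randn(1, 3, 400, 400))
```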
Referring to fig. 3, the mathematical expression of the fast graph reasoning process for the high resolution original image by using the GCN is:
H^(l+1) = σ(A·H^(l)·W^(l))
where A is the adjacency matrix encoding the spatial positions and potential mutual relationships of the pixels, H^(l) and H^(l+1) are the vertex features of the l-th and (l+1)-th layers, W^(l) is the learnable training weight of the l-th layer, and σ is a nonlinear activation function. When l = 0, H^(0) is the two-dimensional graph structure obtained by reshaping the input remote sensing image, with dimension (400 × 400) × 3, where 400 × 400 is the cropped image size, i.e. the window size of step S13, and 3 is the number of input image channels.
Based on the principle that any Laplacian matrix can be diagonalized, the adjacency matrix A is constructed quickly through convolution and matrix operations, with the mathematical expression:
A = φ(BP)·diag(ρ(BP))·φ^T(BP)
where BP is the building edge extraction probability map, φ(·) is a conventional convolution used for dimension change, ρ(·) is adaptive pooling, diag(·) is the diagonalization operation, and (·)^T is the matrix transpose. Substituting A and exploiting the associative law of matrix multiplication avoids storing the huge intermediate adjacency matrix A, which reduces GPU memory overhead and preserves network efficiency, realizing fast GCN reasoning on the high-resolution original image with the mathematical expression:
H^(l+1) = σ(φ(BP)·(diag(ρ(BP))·(φ^T(BP)·H^(l)))·W^(l))
where H^(l) and H^(l+1) are the vertex features of the l-th and (l+1)-th layers, W^(l) is the learnable training weight of the l-th layer, and σ is the nonlinear activation function. When l = 0, H^(0) is the two-dimensional graph structure obtained by reshaping the input high-resolution remote sensing image, with dimension (400 × 400) × 3.
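The adjacency-free reasoning above can be sketched as a single graph-convolution layer that evaluates the product from right to left, so the n × n adjacency matrix (n = 400 × 400) is never stored; the channel sizes and the way ρ(·) is assumed to reduce BP to a k-dimensional vector are illustrative choices, not taken from the invention.

```python
# Sketch of one "fast" GCN layer: A = phi(BP) diag(rho(BP)) phi(BP)^T is never materialised.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FastGraphConv(nn.Module):
    def __init__(self, in_ch=3, out_ch=32, k=16):
        super().__init__()
        self.k = k
        self.phi = nn.Conv2d(1, k, kernel_size=1)             # phi(.): dimension change on BP
        self.weight = nn.Linear(in_ch, out_ch, bias=False)    # learnable W^(l)

    def forward(self, image, bp):
        b, c, h, w = image.shape                               # graph vertices = the h*w pixels
        hl = image.flatten(2).transpose(1, 2)                  # H^(l): (b, n, c), n = h*w
        phi_bp = self.phi(bp).flatten(2).transpose(1, 2)       # phi(BP): (b, n, k)
        rho_bp = F.adaptive_avg_pool2d(bp, (self.k, 1)).flatten(1)  # rho(BP): (b, k), assumed pooling shape
        # evaluate A * H^(l) right to left so the n x n adjacency matrix is never built
        msg = torch.bmm(phi_bp.transpose(1, 2), hl)            # phi^T(BP) H^(l): (b, k, c)
        msg = msg * rho_bp.unsqueeze(-1)                       # diag(rho(BP)) applied row-wise
        agg = torch.bmm(phi_bp, msg)                           # phi(BP) (...): (b, n, c)
        out = torch.relu(self.weight(agg))                     # sigma(A H^(l) W^(l)): (b, n, out_ch)
        return out.transpose(1, 2).reshape(b, -1, h, w)        # GCN feature map GF

gf = FastGraphConv()(torch.randn(1, 3, 400, 400), torch.rand(1, 1, 400, 400))
```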
S3, remapping the resolution of the high-level building semantic features output in step S2 back to the original image through repeated upsampling, lateral connection and convolution operations, and performing building edge extraction and primary building semantic segmentation. When the resolution of the high-level semantic features is remapped back to the original image, a feature pyramid structure fuses the deep semantic features and the shallow texture features of buildings in the remote sensing image, and the idea of multi-task learning is adopted to extract building edges and segment building semantics simultaneously.
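One upsampling, lateral-connection and convolution merge of the kind used in step S3 might look like the sketch below; the channel counts are assumptions.

```python
# Minimal sketch of a feature-pyramid merge (S3): upsample the deeper feature, add a 1x1
# "lateral" projection of the shallower one, then smooth with a 3x3 convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNMerge(nn.Module):
    def __init__(self, deep_ch, shallow_ch, out_ch=256):
        super().__init__()
        self.reduce = nn.Conv2d(deep_ch, out_ch, kernel_size=1)
        self.lateral = nn.Conv2d(shallow_ch, out_ch, kernel_size=1)
        self.smooth = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, deep, shallow):
        up = F.interpolate(self.reduce(deep), size=shallow.shape[-2:],
                           mode='bilinear', align_corners=False)   # upsampling
        return self.smooth(up + self.lateral(shallow))             # lateral connection + convolution

# Repeatedly merging deep -> shallow stage outputs remaps the deep semantics toward 400 x 400.
fused = FPNMerge(deep_ch=2048, shallow_ch=1024)(torch.randn(1, 2048, 13, 13),
                                                torch.randn(1, 1024, 25, 25))
```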
And S4, performing feature integration on the building edge extraction probability map output in step S3 and the high-level building semantic features output in step S2, and performing edge-constrained building semantic segmentation.
In this embodiment, the CNN deep feature map is denoted CF and the building edge probability map BP. The edge information is used to integrate and constrain the deep semantic features in the form CF × (1 + BP), after which a fully convolutional network completes the edge-constrained building semantic segmentation; the constraint process is supervised by the building semantic labels.
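A possible sketch of the CF × (1 + BP) constraint followed by a small fully convolutional head is shown below; the channel counts and head depth are assumptions.

```python
# Sketch of the edge-constrained segmentation (S4): responses of CF near predicted building
# edges are amplified by (1 + BP), then a small fully convolutional head predicts Seg_2.
import torch
import torch.nn as nn

class EdgeConstrainedHead(nn.Module):
    def __init__(self, in_ch=256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_ch, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, kernel_size=1))                  # per-pixel building logit

    def forward(self, cf, bp):
        constrained = cf * (1.0 + bp)                         # CF x (1 + BP) edge constraint
        return self.head(constrained)                         # Seg_2 (logits)

seg2_logits = EdgeConstrainedHead()(torch.randn(1, 256, 128, 128), torch.rand(1, 1, 128, 128))
```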
And S5, adopting a graph feature adaptive optimization module so that the GCN reasoning features output in step S2 optimize the edge-constrained building semantic segmentation result output in step S4, and outputting a building semantic segmentation result with excellent edge performance. The graph feature adaptive optimization module takes the concatenation of the edge-constrained building semantic segmentation result and the GCN features as input, generates an attention map rich in local spatial detail through conventional convolution operations, and uses the GCN features, weighted by the attention map, to optimize the edge-constrained segmentation result, with the mathematical expression:
Seg_3 = δ(Seg_2 + δ(Concat(GF, Seg_2)) * GF)
where Seg_3 is the optimized building semantic segmentation result, Seg_2 is the edge-constrained building semantic segmentation result, GF is the GCN feature, Concat(·) is feature concatenation, and δ(·) is a conventional convolution operation, as shown in FIG. 4.
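Under the formula above, the graph feature adaptive optimization module could be sketched as follows; realizing δ(·) as plain 3 × 3 convolutions and adding a 1 × 1 convolution to match channel counts are assumptions made so the sketch type-checks.

```python
# Sketch of the graph feature adaptive optimization module (S5, Fig. 4):
# Seg_3 = delta(Seg_2 + delta(Concat(GF, Seg_2)) * GF)
import torch
import torch.nn as nn

class GraphFeatureRefine(nn.Module):
    def __init__(self, gf_ch=32, seg_ch=1):
        super().__init__()
        self.attn = nn.Conv2d(gf_ch + seg_ch, gf_ch, kernel_size=3, padding=1)  # inner delta(.)
        self.fuse = nn.Conv2d(gf_ch, seg_ch, kernel_size=1)                     # channel matching (assumption)
        self.out = nn.Conv2d(seg_ch, seg_ch, kernel_size=3, padding=1)          # outer delta(.)

    def forward(self, seg2, gf):
        attention = torch.sigmoid(self.attn(torch.cat([gf, seg2], dim=1)))  # attention map with local spatial detail
        refined = seg2 + self.fuse(attention * gf)                           # inject GCN detail into Seg_2
        return self.out(refined)                                             # Seg_3

seg3 = GraphFeatureRefine()(torch.randn(1, 1, 128, 128), torch.randn(1, 32, 128, 128))
```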
S6, the multitask CNN + GCN semantic segmentation model simultaneously supervises the three building semantic segmentation results output by steps S3, S4 and S5 and the primary edge extraction result, and is trained with backward propagation and stochastic gradient descent to obtain the trained multitask CNN + GCN semantic segmentation model; inputting the remote sensing image of an actual building into the trained model yields a better-optimized actual building edge semantic segmentation result.
In this embodiment, according to the model outputs, a loss function is used to calculate the loss value Loss of the multitask CNN + GCN semantic segmentation model, with the specific formula:
Loss = Σ_{i=1}^{3} L_seg^i + L_edge
where L_seg^i is the loss measuring the difference between the i-th semantic segmentation result and the ground-truth surface class, with the specific formula:
L_seg^i = -(1/N) Σ_{n=1}^{N} [seg*_{n,i}·log(seg_{n,i}) + (1 - seg*_{n,i})·log(1 - seg_{n,i})]
where n is the sample index and N the number of samples, seg_{n,i} is the predicted probability that a pixel in the i-th semantic segmentation result belongs to a building, seg_{n,1} denotes the primary building semantic segmentation result, seg_{n,2} the edge-constrained building semantic segmentation result, and seg_{n,3} the GCN edge-optimized building semantic segmentation result, and seg*_{n,i} is the corresponding true value, equal to 1 where a building is present and 0 otherwise.
L_edge is the loss measuring the difference between the building edge detection result and the true building edge, with the specific formula:
L_edge = -(1/N) Σ_{n=1}^{N} [BP*_n·log(BP_n) + (1 - BP*_n)·log(1 - BP_n)]
where n is the sample index, BP_n is the predicted probability that a pixel in the edge detection is a building edge, and BP*_n is the corresponding true value, equal to 1 for building-edge pixels and 0 otherwise.
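Assuming the cross-entropy form of the loss terms above and equal weighting of the four terms (the weighting is an assumption), the total loss could be computed as follows.

```python
# Sketch of the multitask loss (S6): three segmentation terms L_seg^1..L_seg^3 plus L_edge.
import torch
import torch.nn.functional as F

def multitask_loss(seg_logits, bp_logits, seg_gt, edge_gt):
    """seg_logits: list of the three segmentation outputs Seg_1..Seg_3 (logits, shape (b, 1, h, w))."""
    loss = sum(F.binary_cross_entropy_with_logits(s, seg_gt) for s in seg_logits)  # L_seg terms
    loss = loss + F.binary_cross_entropy_with_logits(bp_logits, edge_gt)           # L_edge term
    return loss

loss = multitask_loss([torch.randn(2, 1, 64, 64) for _ in range(3)],
                      torch.randn(2, 1, 64, 64),
                      torch.randint(0, 2, (2, 1, 64, 64)).float(),
                      torch.randint(0, 2, (2, 1, 64, 64)).float())
```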
A gradient descent algorithm is applied to the loss function; training of the multitask CNN + GCN semantic segmentation model is complete when the loss value Loss approaches convergence, after which the model is applied in the actual execution step. During training, hyper-parameters such as the positive/negative sample ratio, learning rate, batch size and weight decay coefficient are tuned according to the test results.
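An illustrative single training update with backward propagation and stochastic gradient descent is sketched below; the stand-in model and the concrete hyper-parameter values are assumptions to be tuned as described above.

```python
# Illustrative one-step training update (S6): backward propagation + stochastic gradient descent.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Conv2d(3, 2, kernel_size=3, padding=1)   # stand-in producing (segmentation logit, edge logit)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)

images = torch.randn(2, 3, 400, 400)                 # a mini-batch from the training split
seg_gt = torch.randint(0, 2, (2, 1, 400, 400)).float()
edge_gt = torch.randint(0, 2, (2, 1, 400, 400)).float()

out = model(images)
loss = (F.binary_cross_entropy_with_logits(out[:, :1], seg_gt)
        + F.binary_cross_entropy_with_logits(out[:, 1:], edge_gt))
optimizer.zero_grad()
loss.backward()                                      # backward propagation
optimizer.step()                                     # stochastic gradient descent update
```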
The invention has the beneficial effects that:
1. The construction of the adjacency matrix is emulated through matrix diagonalization, and the associative law of matrix multiplication removes the need to store the adjacency matrix, so the GCN can act directly on the high-resolution spatial feature map of the CNN. The GCN is then used to explore the potential spatial relationships among pixels and to perceive their precise spatial positions accurately, which is more feasible than acquiring accurate building edge information with a CNN alone;
2. The end-to-end trainable and predictable multitask CNN + GCN semantic segmentation model uses GCN features rich in precise position information to optimize the CNN semantic segmentation result through the graph feature adaptive optimization module, further improving the accuracy and application value of automatic building extraction results.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN, characterized by comprising the following steps:
s1, constructing a remote sensing image building sample set;
s2, constructing a multitask CNN + GCN semantic segmentation model by using the ResNet and the GCN as frameworks according to the building sample set;
s3, remapping the resolution of the high-level semantic features of the building extracted by ResNet in the step S2 to the original image, and performing building edge extraction and primary building semantic segmentation to respectively obtain a building edge extraction probability map and a primary building semantic segmentation result;
s4, performing feature integration on the building edge extraction probability graph output in the step S3 and the high-level semantic features of the building output in the step S2, and performing semantic segmentation on the building features based on edge constraint to obtain a building semantic segmentation result based on edge constraint;
s5, adopting a graph feature adaptive optimization module so that the GCN reasoning features optimize the edge-constrained building semantic segmentation result output in step S4, and outputting a building semantic segmentation result with excellent edge performance;
and S6, training the multitask CNN + GCN semantic segmentation model on the three building semantic segmentation results output by steps S3, S4 and S5 and the building edge extraction probability map using backward propagation and stochastic gradient descent, thereby obtaining a trained multitask CNN + GCN semantic segmentation model, and inputting the remote sensing image of an actual building into the trained multitask CNN + GCN semantic segmentation model to obtain a better-optimized actual building edge semantic segmentation result.
2. The method of claim 1, wherein the method comprises the following steps: in step S1, the specific steps of constructing the remote sensing image building sample set are as follows:
s11, converting the original building vector grid into a binary building semantic label image, and adjusting the remote sensing image range to be the same as the binary image;
s12, extracting the outline of the building binary image label to obtain a building edge binary image label;
s13, sliding a window with a preset size and a preset step length on the remote sensing image, the building semantic label image and the building edge label image simultaneously, counting pixel points in the window, marking the proportion of the building by the number of the pixel points, and performing image cutting if the proportion is larger than a preset proportion value;
and S14, after the cutting work is finished, dividing the sample set into a training set and a testing set according to the proportion of 7: 3, wherein 70% of data is used for training, and 30% of data is used for testing the performance after training.
3. The method of claim 1, wherein the method comprises the following steps: the high-level semantic features of the building extracted by ResNet are realized by a series of convolution and pooling operations, and the remapping of the resolution to the original image is realized by a plurality of upsampling, transverse connection and convolution operations.
4. The method of claim 1, wherein the method comprises the following steps: in step S4, the building semantic segmentation based on edge constraint is obtained by building semantic tag supervised learning, where a CNN deep feature map is set as CF, a building edge extraction probability map is set as BP, and deep semantic features are integrated and constrained by edge information in a CF x (1+ BP) manner, and then a full convolution network is used to complete building semantic segmentation based on edge constraint.
5. The method of claim 1, wherein the method comprises the following steps: in step S2, the GCN is used to perform graph reasoning on the original image of the building through a conventional graph convolution operation, with the mathematical expression:
H^(l+1) = σ(A·H^(l)·W^(l))
where A is the adjacency matrix encoding the spatial positions and potential mutual relationships of the pixels, H^(l) and H^(l+1) are the vertex features of the l-th and (l+1)-th layers, W^(l) is the learnable training weight of the l-th layer, and σ is a nonlinear activation function; when l = 0, H^(0) is the two-dimensional graph structure obtained by reshaping the input remote sensing image, with dimension (m × m) × n, where m × m is the preset window size, n is the number of input image channels, and m and n are positive integers.
6. The method of claim 4, wherein the method comprises the following steps: based on the principle that any Laplacian matrix can be diagonalized, the adjacency matrix A is constructed quickly through convolution and matrix operations, with the mathematical expression:
A = φ(BP)·diag(ρ(BP))·φ^T(BP)
where BP is the building edge extraction probability map, φ(·) is a conventional convolution used for dimension change, ρ(·) is adaptive pooling, diag(·) is the diagonalization operation, and (·)^T is the matrix transpose; the associative law of matrix multiplication avoids storing the huge intermediate adjacency matrix A, which reduces GPU memory overhead and preserves network efficiency, realizing fast GCN reasoning on the high-resolution original image with the mathematical expression:
H^(l+1) = σ(φ(BP)·(diag(ρ(BP))·(φ^T(BP)·H^(l)))·W^(l))
7. The method of claim 1, wherein the method comprises the following steps: in step S5, the graph feature adaptive optimization module optimizes the CNN building semantic segmentation result by enhancing spatial detail features; it takes the concatenation of the edge-constrained building semantic segmentation result and the GCN features as input, generates an attention map rich in local spatial detail through conventional convolution operations, and uses the GCN features, weighted by the attention map, to optimize the edge-constrained segmentation result, with the mathematical expression:
Seg_3 = δ(Seg_2 + δ(Concat(GF, Seg_2)) * GF)
where Seg_3 is the optimized building semantic segmentation result, Seg_2 is the edge-constrained building semantic segmentation result, GF is the GCN feature, δ(·) is a conventional convolution operation, and Concat(·) is feature concatenation.
8. The method of claim 1, wherein the method comprises the following steps: in step S6, during the training process a loss function is used to calculate the loss value Loss of the multitask CNN + GCN semantic segmentation model, with the specific formula:
Loss = Σ_{i=1}^{3} L_seg^i + L_edge
where L_seg^i is the loss measuring the difference between the i-th semantic segmentation result and the ground-truth surface class, with the specific formula:
L_seg^i = -(1/N) Σ_{n=1}^{N} [seg*_{n,i}·log(seg_{n,i}) + (1 - seg*_{n,i})·log(1 - seg_{n,i})]
where n is the sample index and N the number of samples, seg_{n,i} is the predicted probability that a pixel in the i-th semantic segmentation result belongs to a building, seg_{n,1} denotes the primary building semantic segmentation result, seg_{n,2} the edge-constrained building semantic segmentation result, and seg_{n,3} the GCN edge-optimized building semantic segmentation result, and seg*_{n,i} is the corresponding true value, equal to 1 where a building is present and 0 otherwise;
L_edge is the loss measuring the difference between the building edge detection result and the true building edge, with the specific formula:
L_edge = -(1/N) Σ_{n=1}^{N} [BP*_n·log(BP_n) + (1 - BP*_n)·log(1 - BP_n)]
where n is the sample index, BP_n is the predicted probability that a pixel in the edge detection is a building edge, and BP*_n is the corresponding true value, equal to 1 for building-edge pixels and 0 otherwise.
9. The method of claim 8, wherein the method for optimizing the semantic segmentation edge of the remote sensing image building based on the multitask CNN + GCN comprises the following steps: and applying a gradient descent algorithm to the Loss function, and finishing training the multitask CNN + GCN semantic segmentation model when the Loss value Loss approaches convergence.
10. The method of claim 1, wherein the method comprises the following steps: in step S6, in the training process, the hyper-parameters including the positive and negative sample ratio, the learning rate, the batch size, and the weight attenuation coefficient need to be adjusted according to the test condition.
CN202110725267.7A 2021-06-29 2021-06-29 Remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN Active CN113449640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110725267.7A CN113449640B (en) 2021-06-29 2021-06-29 Remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110725267.7A CN113449640B (en) 2021-06-29 2021-06-29 Remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN

Publications (2)

Publication Number Publication Date
CN113449640A CN113449640A (en) 2021-09-28
CN113449640B true CN113449640B (en) 2022-02-11

Family

ID=77813823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110725267.7A Active CN113449640B (en) 2021-06-29 2021-06-29 Remote sensing image building semantic segmentation edge optimization method based on multitask CNN + GCN

Country Status (1)

Country Link
CN (1) CN113449640B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114821315A (en) * 2022-04-24 2022-07-29 福州大学 Remote sensing image cultivated land plot extraction method combining edge detection and multitask learning
CN116052006B (en) * 2023-03-29 2023-06-16 山东建筑大学 Building edge optimization method based on multitask learning and dual lottery hypothesis
CN116052018B (en) * 2023-03-31 2023-10-27 北京数慧时空信息技术有限公司 Remote sensing image interpretation method based on life learning
CN116434009B (en) * 2023-04-19 2023-10-24 应急管理部国家减灾中心(应急管理部卫星减灾应用中心) Construction method and system for deep learning sample set of damaged building
CN116977750B (en) * 2023-09-25 2023-12-12 中国地质大学(武汉) Construction method and classification method of land covering scene classification model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109903314A (en) * 2019-03-13 2019-06-18 腾讯科技(深圳)有限公司 A kind of method, the method for model training and the relevant apparatus of image-region positioning
CN110889449A (en) * 2019-11-27 2020-03-17 中国人民解放军国防科技大学 Edge-enhanced multi-scale remote sensing image building semantic feature extraction method
CN111461258A (en) * 2020-04-26 2020-07-28 武汉大学 Remote sensing image scene classification method of coupling convolution neural network and graph convolution network
US10984245B1 (en) * 2018-06-11 2021-04-20 Facebook, Inc. Convolutional neural network based on groupwise convolution for efficient video analysis
CN112712127A (en) * 2021-01-07 2021-04-27 北京工业大学 Image emotion polarity classification method combined with graph convolution neural network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10984245B1 (en) * 2018-06-11 2021-04-20 Facebook, Inc. Convolutional neural network based on groupwise convolution for efficient video analysis
CN109903314A (en) * 2019-03-13 2019-06-18 腾讯科技(深圳)有限公司 A kind of method, the method for model training and the relevant apparatus of image-region positioning
CN110889449A (en) * 2019-11-27 2020-03-17 中国人民解放军国防科技大学 Edge-enhanced multi-scale remote sensing image building semantic feature extraction method
CN111461258A (en) * 2020-04-26 2020-07-28 武汉大学 Remote sensing image scene classification method of coupling convolution neural network and graph convolution network
CN112712127A (en) * 2021-01-07 2021-04-27 北京工业大学 Image emotion polarity classification method combined with graph convolution neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A CNN-GCN FRAMEWORK FOR MULTI-LABEL AERIAL IMAGE SCENE CLASSIFICATION; Yansheng Li et al.; IGARSS 2020; 2021-02-17; pp. 1353-1356 *
融合图卷积和差异性池化函数的点云数据分类分割模型 (Point cloud data classification and segmentation model fusing graph convolution and difference pooling functions); 张新良 et al.; 《中国图象图形学报》 (Journal of Image and Graphics); Dec. 2020; Vol. 25, No. 6; pp. 1201-1208 *

Also Published As

Publication number Publication date
CN113449640A (en) 2021-09-28


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant