CN115797633A - Remote sensing image segmentation method, system, storage medium and electronic equipment


Info

Publication number
CN115797633A
Authority
CN
China
Prior art keywords
remote sensing
sensing image
edge
network
image segmentation
Prior art date
Legal status
Granted
Application number
CN202211542414.8A
Other languages
Chinese (zh)
Other versions
CN115797633B (en)
Inventor
许乐乐
李叶
徐金中
郭丽丽
Current Assignee
Technology and Engineering Center for Space Utilization of CAS
Original Assignee
Technology and Engineering Center for Space Utilization of CAS
Priority date
Filing date
Publication date
Application filed by Technology and Engineering Center for Space Utilization of CAS filed Critical Technology and Engineering Center for Space Utilization of CAS
Priority to CN202211542414.8A priority Critical patent/CN115797633B/en
Publication of CN115797633A publication Critical patent/CN115797633A/en
Application granted granted Critical
Publication of CN115797633B publication Critical patent/CN115797633B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a remote sensing image segmentation method, system, storage medium and electronic equipment. The method comprises: constructing a first remote sensing image segmentation model comprising a convolutional feature extraction network, an edge semantic auxiliary network, a transformer network combining edge enhancement and Gaussian position encoding, and a segmentation network, wherein the convolutional feature extraction network, the edge semantic auxiliary network and the segmentation network are each connected to the transformer network combining edge enhancement and Gaussian position encoding; training the first remote sensing image segmentation model on a plurality of remote sensing image samples to obtain a second remote sensing image segmentation model, and deleting the edge semantic auxiliary network from the second model to obtain a target remote sensing image segmentation model; and inputting the remote sensing image to be segmented into the target model to obtain its target image segmentation result. The invention improves the capability of fine and accurate segmentation when targets are densely distributed in the image.

Description

Remote sensing image segmentation method, system, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of image processing, and in particular to a remote sensing image segmentation method, system, storage medium and electronic equipment.
Background
High spatial resolution optical remote sensing images contain rich texture detail. Ground features in such images are densely distributed; for example, houses stand closely adjacent, and trees grow densely and occlude buildings, so target edge information is severely degraded. Meanwhile, interference from complex scene factors such as illumination and shadow is significant, which poses a great challenge to fine and accurate image segmentation.
Attention mechanisms have been widely applied to remote sensing image segmentation with remarkable results. Recently, the Transformer model has attracted increasing attention in computer vision owing to its strength in global information extraction. However, for optical remote sensing images of complex scenes such as densely distributed targets, much edge information is still lost, and segmentation accuracy needs to be improved.
Therefore, it is desirable to provide a technical solution to solve the above technical problems.
Disclosure of Invention
In order to solve the technical problems, the invention provides a remote sensing image segmentation method, a remote sensing image segmentation system, a storage medium and electronic equipment.
The technical scheme of the remote sensing image segmentation method is as follows:
constructing a first remote sensing image segmentation model comprising a convolutional feature extraction network, an edge semantic auxiliary network, a transformer network combining edge enhancement and Gaussian position encoding, and a segmentation network; the convolutional feature extraction network, the edge semantic auxiliary network and the segmentation network are each connected to the transformer network combining edge enhancement and Gaussian position encoding;
training the first remote sensing image segmentation model based on a plurality of remote sensing image samples to obtain a second remote sensing image segmentation model, and deleting the edge semantic auxiliary network in the second remote sensing image segmentation model to obtain a target remote sensing image segmentation model;
and inputting the remote sensing image to be segmented into the target remote sensing image segmentation model to obtain a target image segmentation result of that image.
The remote sensing image segmentation method has the following beneficial effects:
The method of the invention segments the remote sensing image through the convolutional feature extraction network, the transformer network combining edge enhancement and Gaussian position encoding, and the segmentation network, which improves fine and accurate segmentation when targets in the image are densely distributed.
On the basis of the scheme, the remote sensing image segmentation method can be further improved as follows.
Further, the method also comprises the following steps:
and obtaining a plurality of remote sensing image samples, and labeling at least two categories in each remote sensing image sample to obtain the semantic annotation image corresponding to that sample, until a semantic annotation image has been obtained for every remote sensing image sample.
Further, the step of training the first remote sensing image segmentation model based on the plurality of remote sensing image samples to obtain a second remote sensing image segmentation model comprises:
inputting any remote sensing image sample into the convolutional feature extraction network to obtain an initial feature map corresponding to the remote sensing image sample, performing edge extraction on the semantic annotation image corresponding to the remote sensing image sample to obtain a first edge image corresponding to the remote sensing image sample, and inputting the first edge image into the edge semantic auxiliary network to obtain an edge semantic feature map corresponding to the remote sensing image sample;
inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the transformer network combining edge enhancement and Gaussian position encoding to obtain an enhanced feature map corresponding to the remote sensing image sample, and inputting the enhanced feature map into the segmentation network to obtain a first image segmentation result of the remote sensing image sample;
obtaining a loss value for any remote sensing image sample from its first image segmentation result and corresponding semantic annotation image, until a loss value has been obtained for every remote sensing image sample;
and optimizing the first remote sensing image segmentation model based on all loss values to obtain an optimized remote sensing image segmentation model; taking the optimized model as the first remote sensing image segmentation model and returning to the step of inputting any remote sensing image sample into the convolutional feature extraction network, until a preset iterative training condition is met, at which point the optimized remote sensing image segmentation model is determined as the second remote sensing image segmentation model.
Further, the convolutional feature extraction network comprises: at least one first convolutional layer; the step of inputting any remote sensing image sample into the convolutional feature extraction network to obtain the initial feature map corresponding to the remote sensing image sample comprises:
inputting any remote sensing image sample into the convolutional feature extraction network for feature extraction through each first convolutional layer in turn, to obtain the initial feature map corresponding to the remote sensing image sample.
Further, the edge semantic auxiliary network includes: an edge vector, a non-edge vector and an edge semantic layer connected in sequence; the step of inputting the first edge image corresponding to any remote sensing image sample into the edge semantic auxiliary network to obtain the edge semantic feature map corresponding to the remote sensing image sample comprises:
inputting the first edge image, the edge vector and the non-edge vector corresponding to any remote sensing image sample into the edge semantic layer for feature extraction, to obtain the edge semantic feature map corresponding to the remote sensing image sample.
Further, the transformer network combining edge enhancement and Gaussian position encoding comprises: at least one edge position transformer module, each edge position transformer module comprising: a fusion layer, a position encoding layer, a first addition layer, a multi-head attention layer, a second addition layer, a fully connected layer and a third addition layer arranged in sequence; the step of inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the transformer network combining edge enhancement and Gaussian position encoding to obtain an enhanced feature map corresponding to the remote sensing image sample comprises:
inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the fusion layer of the first edge position transformer module for fusion, to obtain a first fused feature map corresponding to the remote sensing image sample, and position-encoding each pixel in the first fused feature map through the position encoding layer of the first edge position transformer module, to obtain two-dimensional position encoding information of the first fused feature map;
inputting the first fused feature map corresponding to any remote sensing image sample and its two-dimensional position encoding information into the first addition layer of the first edge position transformer module for addition, to obtain a first intermediate feature map corresponding to the remote sensing image sample;
and inputting the first intermediate feature map corresponding to any remote sensing image sample into the sequentially connected multi-head attention layer, second addition layer, fully connected layer and third addition layer of the first edge position transformer module for processing, to obtain a second intermediate feature map corresponding to the remote sensing image sample, and taking the second intermediate feature map as the initial feature map of the next edge position transformer module, until the second intermediate feature map has been processed by all edge position transformer modules, yielding the enhanced feature map corresponding to the remote sensing image sample.
The beneficial effect of adopting the further technical scheme is that: the method can further make full use of the enhanced target edge information and the two-dimensional position information in the network, and enhance the training of the remote sensing image segmentation model so as to improve the fine and accurate segmentation capability under the condition of dense target distribution in the image.
Further, the segmentation network includes: at least one second convolutional layer; the step of inputting the enhanced feature map corresponding to any remote sensing image sample into the segmentation network to obtain a first image segmentation result of the remote sensing image sample comprises:
inputting the enhanced feature map corresponding to any remote sensing image sample into the segmentation network for feature extraction through each second convolutional layer in turn, to obtain the first image segmentation result of the remote sensing image sample.
The technical scheme of the remote sensing image segmentation system is as follows:
the method comprises the following steps: the system comprises a model construction module, a model training module and an image segmentation module;
the model construction module is configured to: construct a first remote sensing image segmentation model comprising a convolutional feature extraction network, an edge semantic auxiliary network, a transformer network combining edge enhancement and Gaussian position encoding, and a segmentation network; wherein the convolutional feature extraction network, the edge semantic auxiliary network and the segmentation network are each connected to the transformer network combining edge enhancement and Gaussian position encoding;
the model training module is configured to: train the first remote sensing image segmentation model based on a plurality of remote sensing image samples to obtain a second remote sensing image segmentation model, and delete the edge semantic auxiliary network from the second remote sensing image segmentation model to obtain a target remote sensing image segmentation model;
the image segmentation module is configured to: input the remote sensing image to be segmented into the target remote sensing image segmentation model to obtain a target image segmentation result of that image.
The remote sensing image segmentation system has the following beneficial effects:
the system of the invention segments the remote sensing image through the convolution feature extraction network, the transform network combining edge enhancement and Gaussian position coding and the segmentation network, thereby improving the fine and accurate segmentation capability under the condition of dense target distribution in the image.
The technical scheme of the storage medium of the invention is as follows:
the storage medium has stored therein instructions which, when read by a computer, cause the computer to carry out the steps of a method of remote sensing image segmentation in accordance with the invention.
The technical scheme of the electronic equipment is as follows:
The electronic equipment comprises a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, causes the computer to perform the steps of a remote sensing image segmentation method according to the invention.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a remote sensing image segmentation method provided by the present invention;
FIG. 2 is a flow chart illustrating step 120 of an embodiment of a method for segmenting a remote sensing image according to the present invention;
FIG. 3 is a first schematic structural diagram of a first remote sensing image segmentation model in an embodiment of a remote sensing image segmentation method provided by the invention;
FIG. 4 is a second schematic structural diagram of the first remote sensing image segmentation model in the embodiment of the remote sensing image segmentation method provided by the invention;
fig. 5 shows a schematic structural diagram of an embodiment of a remote sensing image segmentation system provided by the invention.
Detailed Description
Fig. 1 shows a schematic flow chart of a remote sensing image segmentation method according to a first embodiment of the present invention. As shown in fig. 1, the method comprises the steps of:
step 110: and constructing a first remote sensing image segmentation model comprising a convolution feature extraction network N1, an edge semantic auxiliary network N2, a transform network N3 combining edge enhancement and Gaussian position coding and a segmentation network N4.
Here, (1) the convolutional feature extraction network N1, the edge semantic auxiliary network N2 and the segmentation network N4 are each connected to the transformer network N3 combining edge enhancement and Gaussian position encoding. (2) The convolutional feature extraction network N1 is used to extract an initial feature map carrying local context information. (3) The edge semantic auxiliary network N2 is used to obtain an edge semantic feature map containing rich semantic information of target edges. (4) The transformer network N3 combining edge enhancement and Gaussian position encoding is used to extract, from the initial feature map, the edge semantic feature map and the two-dimensional position encoding vector, an enhanced feature map that is edge-semantically enhanced and contains rich global information. (5) The segmentation network N4 is used to obtain an image segmentation result based on the enhanced feature map. (6) The first remote sensing image segmentation model is the remote sensing image segmentation model to be trained.
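For concreteness, the following is a minimal PyTorch sketch of this four-network layout. All class names, channel sizes and the fusion-by-addition choice are illustrative assumptions, not the patent's implementation; the N2 and N3 stand-ins are deliberately simplified, and their full behavior is detailed in the sketches further below.

```python
import torch
import torch.nn as nn

class EdgeSemanticNet(nn.Module):
    """Stand-in for N2: maps a binary edge image to an edge semantic feature map."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Conv2d(1, dim, kernel_size=1)

    def forward(self, edge_img):                      # (B, 1, H, W)
        return self.proj(edge_img)                    # (B, dim, H, W)

class EdgePositionTransformer(nn.Module):
    """Stand-in for N3: fuse features, then one self-attention block."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fc = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, feat, edge_feat=None):
        x = feat + edge_feat if edge_feat is not None else feat  # fusion (assumed additive)
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)              # (B, HW, C) token sequence
        t = t + self.attn(t, t, t)[0]                 # multi-head attention + addition
        t = t + self.fc(t)                            # fully connected layer + addition
        return t.transpose(1, 2).reshape(b, c, h, w)

class FirstSegmentationModel(nn.Module):
    def __init__(self, in_ch=3, dim=64, num_classes=2):
        super().__init__()
        self.n1 = nn.Sequential(nn.Conv2d(in_ch, dim, 3, padding=1),
                                nn.ReLU(inplace=True))           # N1
        self.n2 = EdgeSemanticNet(dim)                           # N2 (training only)
        self.n3 = EdgePositionTransformer(dim)                   # N3
        self.n4 = nn.Conv2d(dim, num_classes, kernel_size=1)     # N4 score map

    def forward(self, image, edge_img=None):
        feat = self.n1(image)
        edge_feat = self.n2(edge_img) if edge_img is not None else None
        return self.n4(self.n3(feat, edge_feat))

model = FirstSegmentationModel()
img = torch.randn(2, 3, 64, 64)
edge = torch.rand(2, 1, 64, 64).round()
print(model(img, edge).shape)   # training call: torch.Size([2, 2, 64, 64])
print(model(img).shape)         # test call (N2 branch unused): same shape
```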
Step 120: training the first remote sensing image segmentation model based on a plurality of remote sensing image samples to obtain a second remote sensing image segmentation model, and deleting the edge semantic auxiliary network N2 in the second remote sensing image segmentation model to obtain a target remote sensing image segmentation model.
Here, (1) a remote sensing image sample is a remote sensing image acquired for training the first remote sensing image segmentation model. (2) The second remote sensing image segmentation model is the trained first remote sensing image segmentation model. (3) The target remote sensing image segmentation model is the model obtained by deleting the edge semantic auxiliary network N2 from the trained first remote sensing image segmentation model; it comprises the convolutional feature extraction network N1, the transformer network N3 combining edge enhancement and Gaussian position encoding, and the segmentation network N4, connected in sequence.
Step 130: inputting the remote sensing image to be segmented into the target remote sensing image segmentation model to obtain a target image segmentation result of the remote sensing image to be segmented.
Here, (1) the remote sensing image to be segmented is any remote sensing image selected for processing. (2) The target image segmentation result is the multi-class segmentation result of the remote sensing image to be segmented.
Preferably, the method further comprises the following steps:
and obtaining a plurality of remote sensing image samples, and labeling at least two categories in each remote sensing image sample to obtain the semantic annotation image corresponding to that sample, until a semantic annotation image has been obtained for every remote sensing image sample.
Here, a semantic annotation image is a remote sensing image in which each pixel is labeled with one of at least two categories.
It should be noted that category labeling of a remote sensing image sample proceeds as follows: at least two categories are preset for distinguishing pixels. For example, the object to be recognized and the background may be set as 2 categories; or object A, object B and the background may be set as 3 categories. Taking building recognition as an example, each pixel in the remote sensing image is labeled as either building or background.
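As a small illustration, a semantic annotation image can be stored as a per-pixel integer mask; the specific integer encoding below (0 for background, 1 for building) is an assumption, since the patent only requires at least two categories.

```python
import numpy as np

# Toy semantic annotation image for one 256x256 sample:
# every pixel carries a category index (0 = background, 1 = building).
label = np.zeros((256, 256), dtype=np.int64)
label[100:150, 80:160] = 1            # pixels inside a building footprint
print(np.unique(label))               # -> [0 1], i.e. at least two categories
```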
Preferably, as shown in fig. 2, step 120 includes:
step 121: inputting any remote sensing image sample into the convolution feature extraction network N1 to obtain an initial feature map corresponding to the remote sensing image sample, performing edge extraction on a semantic annotation image corresponding to the remote sensing image sample to obtain a first edge image corresponding to the remote sensing image sample, and inputting the first edge image into the edge semantic auxiliary network N2 to obtain an edge semantic feature map corresponding to the remote sensing image sample.
Here, (1) the convolutional feature extraction network N1 may include first convolutional layers {C_i^1} (i ∈ {1, …, n_c^1}, n_c^1 ≥ 1), where i is an index and n_c^1 denotes the number of first convolutional layers. (2) The initial feature map is the feature map obtained after the remote sensing image sample passes through the first convolutional layers {C_i^1} for feature extraction. (3) The process of extracting edges from the semantic annotation image to obtain the first edge image is prior art and is not described in detail here. (4) The edge semantic auxiliary network N2 may include an edge vector v_e, a non-edge vector v_ne, edge semantic layers {E_i^2} (i ∈ {1, …, n_e^2}, n_e^2 ≥ 1) and gating layers {G_i^2} (i ∈ {1, …, n_g^2}, n_g^2 ≥ 0), where n_e^2 denotes the number of edge semantic layers and n_g^2 the number of gating layers; the edge vector v_e and the non-edge vector v_ne are learnable vectors. (5) The edge semantic feature map is the feature map obtained after the first edge image is processed by the edge semantic auxiliary network N2.
It should be noted that: (1) the input of an edge semantic layer E_i^2 in the edge semantic auxiliary network N2 is the first edge image, the edge vector v_e and the non-edge vector v_ne, and its output is an edge semantic feature map E_f^i. First, edge extraction is performed on the semantic annotation image corresponding to the input image to obtain the first edge image; then, a dilation operation is applied to the first edge image to obtain an edge-dilated image; then, for each pixel (i, j) of the edge-dilated image, if its value is 0 the corresponding pixel of the initial edge semantic feature map E_f^i is assigned the non-edge vector v_ne, and if its value is not 0 it is assigned the edge vector v_e. Once every pixel of the initial edge semantic feature map has been assigned, the required edge semantic feature map E_f^i is obtained. (2) A gating layer G_i^2 in the edge semantic auxiliary network N2 is used to update the edge semantic feature map; its input is the edge-dilated image, the edge semantic feature map E_f^i and the feature map EGTB_f^i output by an edge position transformer module of the transformer network N3 combining edge enhancement and Gaussian position encoding, and its output is the updated edge semantic feature map E_f^(i+1). For each pixel (i, j) of the edge-dilated image, if its value is 0 the value of pixel (i, j) in the edge semantic feature map is not updated; if it is not 0, the values of pixel (i, j) in E_f^i and EGTB_f^i are added to give the value of pixel (i, j) in the updated edge semantic feature map E_f^(i+1). In this way the updated edge semantic feature map E_f^(i+1) is obtained.
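A minimal PyTorch sketch of the edge semantic layer and the gating update described above. The dilation-by-max-pooling trick, the kernel size and all tensor shapes are assumptions; only the assignment and gating logic follows the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeSemanticLayer(nn.Module):
    def __init__(self, dim, dilate_ks=3):
        super().__init__()
        self.v_e = nn.Parameter(torch.randn(dim))     # learnable edge vector
        self.v_ne = nn.Parameter(torch.randn(dim))    # learnable non-edge vector
        self.dilate_ks = dilate_ks

    def forward(self, edge_img):                      # (B, 1, H, W), values in {0, 1}
        # Dilation via max-pooling: a pixel becomes 1 if any neighbor is an edge.
        dilated = F.max_pool2d(edge_img, self.dilate_ks, stride=1,
                               padding=self.dilate_ks // 2)
        mask = (dilated > 0).float()                  # (B, 1, H, W)
        v_e = self.v_e.view(1, -1, 1, 1)
        v_ne = self.v_ne.view(1, -1, 1, 1)
        # Assign v_e at (dilated) edge pixels and v_ne everywhere else.
        e_f = mask * v_e + (1.0 - mask) * v_ne        # (B, dim, H, W)
        return e_f, mask

def gated_update(e_f, egtb_feat, mask):
    """Gating layer: only (dilated) edge pixels absorb the EGTB module output."""
    return e_f + mask * egtb_feat

layer = EdgeSemanticLayer(dim=64)
edge = torch.zeros(1, 1, 8, 8)
edge[:, :, 4, :] = 1.0                                # one horizontal edge line
e_f, mask = layer(edge)
e_f = gated_update(e_f, torch.randn(1, 64, 8, 8), mask)
```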
Step 122: inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the transformer network N3 combining edge enhancement and Gaussian position encoding to obtain an enhanced feature map corresponding to the remote sensing image sample, and inputting the enhanced feature map corresponding to the remote sensing image sample into the segmentation network N4 to obtain a first image segmentation result of the remote sensing image sample.
Here, (1) the transformer network N3 combining edge enhancement and Gaussian position encoding may include edge position transformer modules {EGTB_i^3} (i ∈ {1, …, n_egtb^3}, n_egtb^3 ≥ 1) and downsampling layers {D_i^3} (i ∈ {1, …, n_d^3}, n_d^3 ≥ 0), where n_egtb^3 denotes the number of edge position transformer modules and n_d^3 the number of downsampling layers. Each edge position transformer module comprises a fusion layer M, a position encoding layer P, a multi-head attention layer MA, a fully connected layer FC, and addition layers {a first addition layer A_1, a second addition layer A_2, a third addition layer A_3}. (2) The enhanced feature map is the feature map obtained after the initial feature map and the edge semantic feature map are processed by the transformer network N3 combining edge enhancement and Gaussian position encoding. (3) The segmentation network N4 may include second convolutional layers {C_i^4 or S_i^4} (i ∈ {1, …, n_c^4}, n_c^4 ≥ 1) and upsampling layers {U_i^4} (i ∈ {1, …, n_u^4}, n_u^4 ≥ 0), where n_c^4 denotes the number of convolutional layers and n_u^4 the number of upsampling layers. (4) The first image segmentation result is the multi-class score map corresponding to the remote sensing image sample.
It should be noted that the position encoding layer P in the transformer network N3 combining edge enhancement and Gaussian position encoding computes the position code of each pixel (i, j) in the feature map using K two-dimensional Gaussian functions, as follows:
$$P(i,j)=\sum_{k=1}^{K}\omega_k(i,j)\,p_k,\qquad \omega_k(i,j)=\exp\!\left(-\frac{1}{2(1-\rho_k^2)}\left[\frac{(i-\mu_{1,k})^2}{\sigma_{1,k}^2}-\frac{2\rho_k\,(i-\mu_{1,k})(j-\mu_{2,k})}{\sigma_{1,k}\sigma_{2,k}}+\frac{(j-\mu_{2,k})^2}{\sigma_{2,k}^2}\right]\right)$$

wherein p ∈ R^(K×d) is a learnable coding matrix, d is the dimension of the position code, μ_1 ∈ R^K and μ_2 ∈ R^K are learnable mean vectors, σ_1 ∈ R^K and σ_2 ∈ R^K are learnable standard deviation vectors, ρ ∈ R^K is a learnable correlation ("closeness") parameter vector, ω denotes the weights of the K two-dimensional Gaussian functions, and P is the resulting two-dimensional position code. By computing the position codes from multiple two-dimensional Gaussian distributions, the target distribution at different positions in the image can be captured adaptively, providing effective position-distribution information for fine and accurate segmentation.
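A sketch of this position encoding layer in PyTorch. The coordinate normalization to [0, 1] and the tanh squashing that keeps |ρ_k| < 1 are implementation assumptions on top of the formula above.

```python
import torch
import torch.nn as nn

class GaussianPositionEncoding(nn.Module):
    def __init__(self, K, d):
        super().__init__()
        self.p = nn.Parameter(torch.randn(K, d))      # coding matrix p
        self.mu1 = nn.Parameter(torch.rand(K))        # row means
        self.mu2 = nn.Parameter(torch.rand(K))        # column means
        self.sigma1 = nn.Parameter(torch.ones(K))     # row standard deviations
        self.sigma2 = nn.Parameter(torch.ones(K))     # column standard deviations
        self.rho = nn.Parameter(torch.zeros(K))       # correlation parameters

    def forward(self, H, W):
        i = torch.linspace(0, 1, H).view(H, 1, 1)     # normalized row coordinates
        j = torch.linspace(0, 1, W).view(1, W, 1)     # normalized column coordinates
        z1 = (i - self.mu1) / self.sigma1             # broadcasts to (H, 1, K)
        z2 = (j - self.mu2) / self.sigma2             # broadcasts to (1, W, K)
        rho = torch.tanh(self.rho)                    # keep |rho| < 1
        q = (z1**2 - 2 * rho * z1 * z2 + z2**2) / (2 * (1 - rho**2))
        omega = torch.exp(-q)                         # (H, W, K) Gaussian weights
        return omega @ self.p                         # (H, W, d) position codes

pe = GaussianPositionEncoding(K=8, d=64)
print(pe(32, 32).shape)                               # torch.Size([32, 32, 64])
```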
Step 123: obtaining the loss value of any remote sensing image sample from its first image segmentation result and corresponding semantic annotation image, until the loss value of every remote sensing image sample has been obtained.
Specifically, the first image segmentation result corresponding to any remote sensing image sample is compared with its semantic annotation image, and the loss value of the remote sensing image sample is computed using the loss function of the first remote sensing image segmentation model; this is repeated until the loss value of every remote sensing image sample has been obtained.
Step 124: optimizing the first remote sensing image segmentation model based on all loss values to obtain an optimized remote sensing image segmentation model; taking the optimized model as the first remote sensing image segmentation model and returning to step 121, until a preset iterative training condition is met, at which point the optimized remote sensing image segmentation model is determined as the second remote sensing image segmentation model.
Here, (1) the preset iterative training condition is, for example, reaching a maximum number of training iterations or convergence of the loss function. (2) The optimized remote sensing image segmentation model is the remote sensing image segmentation model obtained after one round of iterative training.
Specifically, the parameters of the first remote sensing image segmentation model are optimized according to all loss values to obtain an optimized remote sensing image segmentation model, and it is judged whether the optimized model meets the preset iterative training condition; if so, the optimized remote sensing image segmentation model is determined as the second remote sensing image segmentation model; if not, the optimized model is taken as the first remote sensing image segmentation model and step 121 is executed again, until the preset iterative training condition is met and the optimized remote sensing image segmentation model is determined as the second remote sensing image segmentation model.
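A condensed sketch of this training loop. The dataset interface, the Adam optimizer and the cross-entropy loss are assumptions; the patent does not name a specific loss function or optimizer.

```python
import torch
import torch.nn.functional as F

def train(model, loader, epochs=100, lr=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):                   # preset iteration condition
        for image, label, edge in loader:         # sample, annotation, first edge image
            logits = model(image, edge)           # first image segmentation result
            loss = F.cross_entropy(logits, label) # compare with semantic annotation
            opt.zero_grad()
            loss.backward()
            opt.step()
    # After training, the edge semantic auxiliary network (N2) is deleted to
    # obtain the target model; inference then runs without the edge input.
    return model
```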
Preferably, the convolutional feature extraction network N1 includes: at least one first convolutional layer; the step of inputting any remote sensing image sample into the convolutional feature extraction network N1 to obtain the initial feature map corresponding to the remote sensing image sample comprises:
inputting any remote sensing image sample into the convolutional feature extraction network N1 for feature extraction through each first convolutional layer in turn, to obtain the initial feature map corresponding to the remote sensing image sample.
Fig. 3 shows a first structural diagram of the first remote sensing image segmentation model. As shown in fig. 3, the convolutional feature extraction network N1 includes at least one convolutional layer C_1^1.
Preferably, the edge semantic auxiliary network N2 includes: an edge vector, a non-edge vector and an edge semantic layer connected in sequence.
As shown in fig. 3, the edge semantic auxiliary network N2 includes, arranged in sequence, an edge vector v_e, a non-edge vector v_ne and an edge semantic layer E_1^2.
The step of inputting the first edge image corresponding to any remote sensing image sample into the edge semantic auxiliary network N2 to obtain the edge semantic feature map corresponding to the remote sensing image sample comprises:
inputting the first edge image, the edge vector and the non-edge vector corresponding to any remote sensing image sample into the edge semantic layer for feature extraction, to obtain the edge semantic feature map corresponding to the remote sensing image sample.
Preferably, the transformer network N3 combining edge enhancement and Gaussian position encoding comprises: at least one edge position transformer module, each edge position transformer module comprising: a fusion layer, a position encoding layer, a first addition layer, a multi-head attention layer, a second addition layer, a fully connected layer and a third addition layer arranged in sequence.
As shown in fig. 3, the transformer network N3 combining edge enhancement and Gaussian position encoding comprises one edge position transformer module, which includes, arranged in sequence: a fusion layer M_1^3, a position encoding layer P_1^3, a first addition layer A_11^3, a multi-head attention layer MA_1^3, a second addition layer A_12^3, a fully connected layer FC_1^3 and a third addition layer A_13^3.
The step of inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the transformer network N3 combining edge enhancement and Gaussian position encoding to obtain an enhanced feature map corresponding to the remote sensing image sample comprises:
inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the fusion layer of the first edge position transformer module for fusion, to obtain a first fused feature map corresponding to the remote sensing image sample, and position-encoding each pixel in the first fused feature map through the position encoding layer of the first edge position transformer module, to obtain two-dimensional position encoding information of the first fused feature map.
Here, (1) the first fused feature map is the feature map obtained after the fusion layer of the edge position transformer module fuses the initial feature map and the edge semantic feature map corresponding to the remote sensing image sample. (2) The two-dimensional position encoding information is obtained by the position encoding layer P_1^3 position-encoding the pixels in the first fused feature map.
Inputting the first fused feature map corresponding to any remote sensing image sample and its two-dimensional position encoding information into the first addition layer of the first edge position transformer module for addition, to obtain a first intermediate feature map corresponding to the remote sensing image sample.
It should be noted that, in the training stage, since the first remote sensing image segmentation model includes the edge semantic auxiliary network N2, the first fused feature map is obtained by fusing the initial feature map and the edge semantic feature map. In the testing stage, the target remote sensing image segmentation model does not contain the edge semantic auxiliary network N2, so the first fused feature map is simply the initial feature map.
Inputting the first intermediate feature map corresponding to any remote sensing image sample into the sequentially connected multi-head attention layer, second addition layer, fully connected layer and third addition layer of the first edge position transformer module for processing, to obtain a second intermediate feature map corresponding to the remote sensing image sample, and taking the second intermediate feature map as the initial feature map of the next edge position transformer module, until the second intermediate feature map has been processed by all edge position transformer modules, yielding the enhanced feature map corresponding to the remote sensing image sample.
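A sketch of one edge position transformer (EGTB) block in exactly the order given above: fusion, position encoding, first addition, multi-head attention, second addition, fully connected layer, third addition. Tensor shapes, fusion by element-wise addition and the 4x feed-forward expansion are assumptions.

```python
import torch
import torch.nn as nn

class EGTBBlock(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fc = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(),
                                nn.Linear(dim * 4, dim))

    def forward(self, feat, edge_feat, pos_code):
        # feat, edge_feat: (B, C, H, W); pos_code: (H, W, C), e.g. from the
        # GaussianPositionEncoding sketch earlier.
        fused = feat + edge_feat                      # fusion layer M
        b, c, h, w = fused.shape
        x = fused.flatten(2).transpose(1, 2)          # (B, HW, C) token sequence
        x = x + pos_code.reshape(h * w, c)            # first addition layer A1
        x = x + self.attn(x, x, x)[0]                 # MA + second addition layer A2
        x = x + self.fc(x)                            # FC + third addition layer A3
        return x.transpose(1, 2).reshape(b, c, h, w)  # second intermediate feature map

block = EGTBBlock(dim=64)
out = block(torch.randn(2, 64, 16, 16), torch.randn(2, 64, 16, 16),
            torch.randn(16, 16, 64))
print(out.shape)                                      # torch.Size([2, 64, 16, 16])
```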
Preferably, the segmentation network N4 comprises: at least one second convolutional layer.
As shown in fig. 3, the segmentation network N4 includes one convolutional layer S_1^4.
The step of inputting the enhanced feature map corresponding to any remote sensing image sample into the segmentation network N4 to obtain a first image segmentation result of the remote sensing image sample comprises:
inputting the enhanced feature map corresponding to any remote sensing image sample into the segmentation network N4 for feature extraction through each second convolutional layer in turn, to obtain the first image segmentation result of the remote sensing image sample.
Further, fig. 4 shows a second structural diagram of the first remote sensing image segmentation model. As shown in fig. 4, the convolutional feature extraction network N1 includes, arranged in sequence, a first convolutional layer C_1^1 and a first convolutional layer C_2^1; that is, the convolutional feature extraction network N1 includes n_c^1 = 2 convolutional layers in total.
The edge semantic auxiliary network N2 includes, arranged in sequence, an edge vector v_e, a non-edge vector v_ne, an edge semantic layer E_1^2, a gating layer G_1^2 and a gating layer G_2^2; that is, the edge semantic auxiliary network N2 includes n_e^2 = 1 edge semantic layer and n_g^2 = 2 gating layers in total.
The transformer network N3 combining edge enhancement and Gaussian position encoding includes, arranged in sequence, an edge position transformer module EGTB_1^3, a downsampling layer D_1^3, an edge position transformer module EGTB_2^3, a downsampling layer D_2^3 and an edge position transformer module EGTB_3^3, where each edge position transformer module includes, arranged in sequence, a fusion layer M_1^3, a position encoding layer P_1^3, a first addition layer A_11^3, a multi-head attention layer MA_1^3, a second addition layer A_12^3, a fully connected layer FC_1^3 and a third addition layer A_13^3. That is, the transformer network N3 includes n_1 + n_2 + n_3 fusion layers, n_1 + n_2 + n_3 position encoding layers, n_1 + n_2 + n_3 multi-head attention layers, n_1 + n_2 + n_3 fully connected layers, 3 × (n_1 + n_2 + n_3) addition layers and 2 downsampling layers; in other words, n_egtb^3 = n_1 + n_2 + n_3 EGTB modules and n_d^3 = 2 downsampling layers in total.
In the portion of fig. 4 framed as an edge position transformer module, the fusion layer M_1^3, the position encoding layer P_1^3, the multi-head attention layer MA_1^3, the fully connected layer FC_1^3 and the addition layers {A_11^3, A_12^3, A_13^3} form a processing group for extracting global context information to obtain the enhanced feature map; the module also comprises other groups of the same structure arranged in parallel.
The segmentation network N4 includes, arranged in sequence, a second convolutional layer C_1^4, a second convolutional layer C_2^4, an upsampling layer U_1^4, a second convolutional layer C_3^4, a second convolutional layer C_4^4, an upsampling layer U_2^4, a second convolutional layer C_5^4, a second convolutional layer C_6^4 and a second convolutional layer S_1^4; that is, the segmentation network N4 includes n_c^4 = 7 convolutional layers and n_u^4 = 2 upsampling layers in total.
It should be noted that (1) a downsampling layer can be implemented using a pooling operation, or a convolution operation with stride greater than 1, to reduce the spatial dimensions of the features; an upsampling layer can be implemented using a transposed convolution operation, a bilinear interpolation operation or an unpooling operation to increase them; and a fusion layer can be implemented using an addition, concatenation or averaging operation to fuse the information of several features. (2) C denotes a convolutional layer with a 3 × 3 kernel, and S denotes a convolutional layer with a 1 × 1 kernel. (3) The convolutional layers in the convolutional feature extraction network N1 extract an initial feature map with local context information. (4) The edge semantic layer in the edge semantic auxiliary network N2 extracts an edge semantic feature map containing rich target edge information, and the gating layers update the edge semantic feature map based on the continuously learned feature maps. (5) In the transformer network N3 combining edge enhancement and Gaussian position encoding, the fusion layer fully fuses the initial feature map and the edge semantic feature map; the position encoding layer adaptively captures the target distribution at different positions in the image and provides effective target position-distribution information; and the multi-head attention, addition and fully connected layers extract global context information to obtain the enhanced feature map. (6) In the segmentation network N4, the upsampling layers increase the spatial dimensions of the image features step by step back to the original image size, and the convolutional layers refine the feature map, with the last convolutional layer generating the score map that gives the image segmentation result.
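The note above leaves each of these layers open to several interchangeable implementations; the following is a short sketch of the common PyTorch choices (all parameter values are placeholders).

```python
import torch.nn as nn

def downsample(dim, use_pooling=False):
    # Either pooling or a stride-2 convolution halves the spatial dimensions.
    if use_pooling:
        return nn.MaxPool2d(2)
    return nn.Conv2d(dim, dim, kernel_size=3, stride=2, padding=1)

def upsample(dim, use_interpolation=False):
    # Either bilinear interpolation or a transposed convolution doubles them.
    if use_interpolation:
        return nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
    return nn.ConvTranspose2d(dim, dim, kernel_size=2, stride=2)
```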
The technical solution of this embodiment is suitable for remote sensing image segmentation in complex scenes such as densely distributed targets. By segmenting the remote sensing image through the convolutional feature extraction network N1, the transformer network N3 combining edge enhancement and Gaussian position encoding, and the segmentation network N4, the enhanced target edge information and the two-dimensional position information in the network can be fully exploited, improving fine and accurate segmentation when targets in the remote sensing image are densely distributed.
Fig. 5 shows a schematic structural diagram of an embodiment of a remote sensing image segmentation system provided by the invention. As shown in fig. 5, the system 200 includes: a model construction module 210, a model training module 220 and an image segmentation module 230.
The model construction module 210 is configured to: construct a first remote sensing image segmentation model comprising a convolutional feature extraction network N1, an edge semantic auxiliary network N2, a transformer network N3 combining edge enhancement and Gaussian position encoding, and a segmentation network N4; the convolutional feature extraction network N1, the edge semantic auxiliary network N2 and the segmentation network N4 are each connected to the transformer network N3 combining edge enhancement and Gaussian position encoding;
the model training module 220 is configured to: training the first remote sensing image segmentation model based on a plurality of remote sensing image samples to obtain a second remote sensing image segmentation model, and deleting the edge semantic auxiliary network N2 in the second remote sensing image segmentation model to obtain a target remote sensing image segmentation model;
the image segmentation module 230 is configured to: and inputting the remote sensing image to be detected into the target remote sensing image segmentation model to obtain a target image segmentation result of the remote sensing image to be detected.
The technical solution of this embodiment is suitable for remote sensing image segmentation in complex scenes such as densely distributed targets. By segmenting the remote sensing image through the convolutional feature extraction network N1, the transformer network N3 combining edge enhancement and Gaussian position encoding, and the segmentation network N4, the enhanced target edge information and the two-dimensional position information in the network can be fully exploited, improving fine and accurate segmentation when targets in the remote sensing image are densely distributed.
The above steps for realizing the corresponding functions of each parameter and each module in the remote sensing image segmentation system 200 of the present embodiment may refer to each parameter and step in the above embodiments of a remote sensing image segmentation method, which are not described herein again.
An embodiment of the present invention provides a storage medium in which instructions are stored; when a computer reads the instructions, it performs the steps of the remote sensing image segmentation method above. Reference may be made to the parameters and steps in the method embodiment above, which are not repeated here.
Computer storage media include, for example, flash drives and portable hard disks.
An electronic device provided by an embodiment of the present invention includes a memory, a processor and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the remote sensing image segmentation method above are performed. Reference may be made to the parameters and steps in the method embodiment above, which are not repeated here.
Those skilled in the art will appreciate that the present invention may be embodied as methods, systems, storage media and electronic devices.
Thus, the present invention may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in a combination of hardware and software, which may be referred to herein generally as a "circuit", "module" or "system". Furthermore, in some embodiments, the invention may also be embodied as a computer program product in one or more computer-readable media having computer-readable program code embodied therein.

Any combination of one or more computer-readable media may be employed. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus or device.

Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and should not be construed as limiting the present invention; variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A remote sensing image segmentation method is characterized by comprising the following steps:
constructing a first remote sensing image segmentation model comprising a convolutional feature extraction network, an edge semantic auxiliary network, a transformer network combining edge enhancement and Gaussian position encoding and a segmentation network; the convolutional feature extraction network, the edge semantic auxiliary network and the segmentation network are each connected to the transformer network combining edge enhancement and Gaussian position encoding;
training the first remote sensing image segmentation model based on a plurality of remote sensing image samples to obtain a second remote sensing image segmentation model, and deleting the edge semantic auxiliary network in the second remote sensing image segmentation model to obtain a target remote sensing image segmentation model;
and inputting the remote sensing image to be segmented into the target remote sensing image segmentation model to obtain a target image segmentation result of that image.
2. The remote sensing image segmentation method according to claim 1, further comprising:
and obtaining a plurality of remote sensing image samples, and labeling at least two categories in each remote sensing image sample to obtain the semantic annotation image corresponding to that sample, until a semantic annotation image has been obtained for every remote sensing image sample.
3. The remote sensing image segmentation method according to claim 2, wherein the step of training the first remote sensing image segmentation model based on the plurality of remote sensing image samples to obtain a second remote sensing image segmentation model comprises:
inputting any remote sensing image sample into the convolutional feature extraction network to obtain an initial feature map corresponding to the remote sensing image sample, performing edge extraction on the semantic annotation image corresponding to the remote sensing image sample to obtain a first edge image corresponding to the remote sensing image sample, and inputting the first edge image into the edge semantic auxiliary network to obtain an edge semantic feature map corresponding to the remote sensing image sample;
inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the transformer network combining edge enhancement and Gaussian position encoding to obtain an enhanced feature map corresponding to the remote sensing image sample, and inputting the enhanced feature map into the segmentation network to obtain a first image segmentation result of the remote sensing image sample;
obtaining a loss value for any remote sensing image sample from its first image segmentation result and corresponding semantic annotation image, until a loss value has been obtained for every remote sensing image sample;
and optimizing the first remote sensing image segmentation model based on all loss values to obtain an optimized remote sensing image segmentation model; taking the optimized model as the first remote sensing image segmentation model and returning to the step of inputting any remote sensing image sample into the convolutional feature extraction network, until a preset iterative training condition is met, at which point the optimized remote sensing image segmentation model is determined as the second remote sensing image segmentation model.
4. The remote sensing image segmentation method according to claim 3, wherein the convolutional feature extraction network includes: at least one first convolutional layer; the step of inputting any remote sensing image sample into the convolutional feature extraction network to obtain an initial feature map corresponding to the remote sensing image sample comprises:
inputting any remote sensing image sample into the convolutional feature extraction network for feature extraction through each first convolutional layer in turn, to obtain the initial feature map corresponding to the remote sensing image sample.
5. The remote sensing image segmentation method according to claim 3, wherein the edge semantic auxiliary network includes: an edge vector, a non-edge vector and an edge semantic layer connected in sequence; the step of inputting a first edge image corresponding to any remote sensing image sample into the edge semantic auxiliary network to obtain an edge semantic feature map corresponding to the remote sensing image sample comprises:
inputting a first edge image, an edge vector and a non-edge vector corresponding to any remote sensing image sample into the edge semantic layer for feature extraction to obtain an edge semantic feature map corresponding to the remote sensing image sample.
6. The remote sensing image segmentation method according to claim 3, wherein the transformer network combining edge enhancement and Gaussian position encoding comprises: at least one edge position transformer module, each edge position transformer module comprising: a fusion layer, a position encoding layer, a first addition layer, a multi-head attention layer, a second addition layer, a fully connected layer and a third addition layer arranged in sequence; the step of inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the transformer network combining edge enhancement and Gaussian position encoding to obtain an enhanced feature map corresponding to the remote sensing image sample comprises:
inputting the initial feature map and the edge semantic feature map corresponding to any remote sensing image sample into the fusion layer of the first edge position transformer module for fusion, to obtain a first fused feature map corresponding to the remote sensing image sample, and position-encoding each pixel in the first fused feature map through the position encoding layer of the first edge position transformer module, to obtain two-dimensional position encoding information of the first fused feature map;
inputting the first fused feature map corresponding to any remote sensing image sample and its two-dimensional position encoding information into the first addition layer of the first edge position transformer module for addition, to obtain a first intermediate feature map corresponding to the remote sensing image sample;
and inputting the first intermediate feature map corresponding to any remote sensing image sample into the sequentially connected multi-head attention layer, second addition layer, fully connected layer and third addition layer of the first edge position transformer module for processing, to obtain a second intermediate feature map corresponding to the remote sensing image sample, and taking the second intermediate feature map as the initial feature map of the next edge position transformer module, until the second intermediate feature map has been processed by all edge position transformer modules, yielding the enhanced feature map corresponding to the remote sensing image sample.
7. The remote sensing image segmentation method according to claim 3, wherein the segmentation network comprises: at least one second convolutional layer; the step of inputting the enhanced feature map corresponding to any remote sensing image sample into the segmentation network to obtain a first image segmentation result of the remote sensing image sample comprises:
inputting the enhanced feature map corresponding to any remote sensing image sample into the segmentation network for feature extraction through each second convolutional layer in turn, to obtain the first image segmentation result of the remote sensing image sample.
8. A remote sensing image segmentation system, comprising: a model construction module, a model training module and an image segmentation module;
the model construction module is configured to: construct a first remote sensing image segmentation model comprising a convolutional feature extraction network, an edge semantic auxiliary network, a transformer network combining edge enhancement and Gaussian position encoding and a segmentation network; wherein the convolutional feature extraction network, the edge semantic auxiliary network and the segmentation network are each connected to the transformer network combining edge enhancement and Gaussian position encoding;
the model training module is configured to: train the first remote sensing image segmentation model based on a plurality of remote sensing image samples to obtain a second remote sensing image segmentation model, and delete the edge semantic auxiliary network from the second remote sensing image segmentation model to obtain a target remote sensing image segmentation model;
the image segmentation module is configured to: input the remote sensing image to be segmented into the target remote sensing image segmentation model to obtain a target image segmentation result of that image.
9. A storage medium, characterized in that instructions are stored therein, which when read by a computer, cause the computer to carry out the remote sensing image segmentation method according to any one of claims 1 to 7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, causes the computer to perform the method of segmentation of remote sensing images according to any one of claims 1 to 7.
CN202211542414.8A 2022-12-02 2022-12-02 Remote sensing image segmentation method, remote sensing image segmentation system, storage medium and electronic equipment Active CN115797633B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211542414.8A CN115797633B (en) 2022-12-02 2022-12-02 Remote sensing image segmentation method, remote sensing image segmentation system, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211542414.8A CN115797633B (en) 2022-12-02 2022-12-02 Remote sensing image segmentation method, remote sensing image segmentation system, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN115797633A true CN115797633A (en) 2023-03-14
CN115797633B CN115797633B (en) 2023-06-27

Family

ID=85445250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211542414.8A Active CN115797633B (en) 2022-12-02 2022-12-02 Remote sensing image segmentation method, remote sensing image segmentation system, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN115797633B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107610141A (en) * 2017-09-05 2018-01-19 华南理工大学 A kind of remote sensing images semantic segmentation method based on deep learning
CN110059768A (en) * 2019-04-30 2019-07-26 福州大学 The semantic segmentation method and system of the merging point and provincial characteristics that understand for streetscape
CN110443822A (en) * 2019-07-16 2019-11-12 浙江工业大学 A kind of high score remote sensing target fine extracting method of semanteme edge auxiliary
CN111462126A (en) * 2020-04-08 2020-07-28 武汉大学 Semantic image segmentation method and system based on edge enhancement
CN113554655A (en) * 2021-07-13 2021-10-26 中国科学院空间应用工程与技术中心 Optical remote sensing image segmentation method and device based on multi-feature enhancement
CN114140480A (en) * 2021-12-09 2022-03-04 安徽大学 Thermal infrared electrical equipment image semantic segmentation method based on edge-assisted learning
CN114596520A (en) * 2022-02-09 2022-06-07 天津大学 First visual angle video action identification method and device
CN114677349A (en) * 2022-03-25 2022-06-28 西安交通大学 Image segmentation method and system for edge information enhancement and attention guidance of encoding and decoding

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHEN Z. et al.: "A building change detection method for high-resolution remote sensing imagery based on edge guidance and differential enhancement", ISPRS Journal of Photogrammetry and Remote Sensing, pages 203-222 *
LI Y. et al.: "A Y-Net deep learning method for road segmentation using high-resolution visible remote sensing images", Remote Sensing Letters, pages 381-390 *
LUAN Xiaomei et al.: "A weakly supervised semantic segmentation method for remote sensing images based on edge enhancement", Computer Engineering and Applications, vol. 58, no. 20, pages 188-196 *
LIANG Liming et al.: "A skin lesion segmentation algorithm fusing multi-scale Transformer", Journal of Jilin University (Engineering and Technology Edition), pages 1-13 *

Also Published As

Publication number Publication date
CN115797633B (en) 2023-06-27

Similar Documents

Publication Publication Date Title
CN112560876B (en) Single-stage small sample target detection method for decoupling measurement
US11200424B2 (en) Space-time memory network for locating target object in video content
CN108664981B (en) Salient image extraction method and device
JP2015079505A (en) Noise identification method and noise identification device of parallax depth image
CN113343982B (en) Entity relation extraction method, device and equipment for multi-modal feature fusion
KR20220153667A (en) Feature extraction methods, devices, electronic devices, storage media and computer programs
CN116228792A (en) Medical image segmentation method, system and electronic device
Dong et al. Learning regional purity for instance segmentation on 3d point clouds
CN117078930A (en) Medical image segmentation method based on boundary sensing and attention mechanism
CN111612075A (en) Interest point and descriptor extraction method based on joint feature recombination and feature mixing
CN114550014A (en) Road segmentation method and computer device
CN116563285B (en) Focus characteristic identifying and dividing method and system based on full neural network
CN116778164A (en) Semantic segmentation method for improving deep V < 3+ > network based on multi-scale structure
WO2022257602A1 (en) Video object segmentation method and apparatus, storage medium, and electronic device
CN113554655B (en) Optical remote sensing image segmentation method and device based on multi-feature enhancement
CN115797633B (en) Remote sensing image segmentation method, remote sensing image segmentation system, storage medium and electronic equipment
CN113222016B (en) Change detection method and device based on cross enhancement of high-level and low-level features
CN115810152A (en) Remote sensing image change detection method and device based on graph convolution and computer equipment
CN113780305B (en) Significance target detection method based on interaction of two clues
CN114998630A (en) Ground-to-air image registration method from coarse to fine
CN114792370A (en) Whole lung image segmentation method and device, electronic equipment and storage medium
CN114842066A (en) Image depth recognition model training method, image depth recognition method and device
CN111539922B (en) Monocular depth estimation and surface normal vector estimation method based on multitask network
CN113610856A (en) Method and device for training image segmentation model and image segmentation
CN114066841A (en) Sky detection method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant