CN112837320B - Remote sensing image semantic segmentation method based on parallel hole convolution

Remote sensing image semantic segmentation method based on parallel hole convolution

Info

Publication number
CN112837320B
Authority
CN
China
Prior art keywords
remote sensing
convolution
sensing image
network
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110129416.3A
Other languages
Chinese (zh)
Other versions
CN112837320A (en)
Inventor
张东映
唐振超
罗蔚然
洪志明
梁忠壮
刘震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202110129416.3A priority Critical patent/CN112837320B/en
Publication of CN112837320A publication Critical patent/CN112837320A/en
Application granted granted Critical
Publication of CN112837320B publication Critical patent/CN112837320B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 7/10: Image analysis; Segmentation; Edge detection
    • G06F 18/241: Pattern recognition; Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/253: Pattern recognition; Fusion techniques of extracted features
    • G06N 3/08: Computing arrangements based on biological models; Neural networks; Learning methods
    • G06T 5/30: Image enhancement or restoration using local operators; Erosion or dilatation, e.g. thinning
    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 2207/10024: Image acquisition modality; Color image
    • G06T 2207/10032: Image acquisition modality; Satellite or aerial image; Remote sensing
    • G06T 2207/20081: Special algorithmic details; Training; Learning
    • G06T 2207/20084: Special algorithmic details; Artificial neural networks [ANN]
    • G06T 2207/20221: Special algorithmic details; Image combination; Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a remote sensing image semantic segmentation method based on parallel hole (dilated) convolution, which relates to the technical field of remote sensing images and comprises the following steps: a high-resolution remote sensing image is acquired in advance, sliced, normalized, and standardized to obtain the source high-resolution remote sensing image; the feature extraction network is initialized with resnet101 parameters pre-trained on ImageNet, the low-level layers of resnet101 are taken, a parallel hole convolution network is constructed, and shallow features of the source image are extracted; the shallow features are fed into the parallel hole convolution network to obtain multi-scale information, and the multi-scale information is fused; the fused features are re-fused with the shallow features, and image-level information is repaired with a fully connected conditional random field to obtain the semantic segmentation result. The invention enlarges the convolution receptive field without adding extra parameters and, compared with standard convolution reaching the same receptive field, the parallel hole convolution method saves GPU memory.

Description

Remote sensing image semantic segmentation method based on parallel hole convolution
Technical Field
The invention relates to the technical field of remote sensing images, and in particular to a remote sensing image semantic segmentation method based on parallel hole (dilated) convolution.
Background
With the maturation and commercialization of satellite remote sensing technology, and with the encouragement and promotion of governments around the world, satellite remote sensing is developing rapidly and being applied in more and more fields. Semantic segmentation of remote sensing images is an important link in satellite remote sensing applications and is widely used in pattern recognition tasks such as city planning, road planning, ground-object target extraction, and even automatic driving. Improving semantic segmentation accuracy is therefore of great significance for remote sensing image processing.
The ground-object information in remote sensing images is complex and varied, and to improve segmentation accuracy researchers have carried out extensive studies and proposed many algorithms. These algorithms mainly (a) apply a fully convolutional network to segment the remote sensing image, or (b) fuse feature information at symmetric scales on top of the fully convolutional network and record pooling indices for unpooling to compensate for the loss of position information. All of these methods are based on standard convolution, whose receptive field is limited; enlarging the receptive field therefore has important research and application value for semantic segmentation of remote sensing images.
For the problems in the related art, no effective solution has been proposed at present.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides a remote sensing image semantic segmentation method based on parallel hole convolution to overcome the technical problems of the prior art.
The technical scheme of the invention is realized as follows:
A remote sensing image semantic segmentation method based on parallel hole convolution comprises the following steps:
acquiring a high-resolution remote sensing image in advance, slicing it, and normalizing and standardizing the slices to obtain the source high-resolution remote sensing image;
initializing the feature extraction network with resnet101 parameters pre-trained on ImageNet, taking the low-level layers of resnet101, constructing a parallel hole convolution network, and extracting shallow features of the source high-resolution remote sensing image;
feeding the shallow features into the parallel hole convolution network to obtain multi-scale information and fusing the multi-scale information, wherein the multi-scale information is captured by setting different expansion (dilation) rates;
re-fusing the fused features with the shallow features, and repairing image-level information with a fully connected conditional random field to obtain the semantic segmentation result.
Further, slicing the high-resolution remote sensing image includes slicing it into tiles whose length and width are 512 pixels.
Further, the method also comprises the following step:
extracting the three RGB channels from the sliced high-resolution remote sensing image.
Further, the parallel hole convolution network comprises the following steps:
starting from standard convolution: let F: \mathbb{Z}^2 \to \mathbb{R} be a discrete function, let \Omega_r = [-r, r]^2 \cap \mathbb{Z}^2, and let k: \Omega_r \to \mathbb{R} be a discrete convolution kernel; the convolution centered on p is computed as
(F * k)(p) = \sum_{s + t = p} F(s)\,k(t);
generalizing the standard convolution with an expansion rate l gives the hole convolution
(F *_l k)(p) = \sum_{s + l \cdot t = p} F(s)\,k(t);
carrying out hole convolution on the shallow features in parallel with different expansion rates to obtain multi-scale features, and fusing the multi-scale features by splicing (concatenation) to form the parallel hole convolution network layer.
Further, the expansion rates are set to 2, 3, 4, and 5, respectively.
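The following is a minimal PyTorch sketch of such a parallel hole (dilated) convolution layer; the module name, channel arguments, and the 1x1 fusion convolution are illustrative assumptions rather than details fixed by the patent.

```python
import torch
import torch.nn as nn

class ParallelDilatedConv(nn.Module):
    """3x3 hole convolutions with expansion rates 2, 3, 4, 5 applied in parallel,
    fused by channel-wise concatenation (splicing) and a 1x1 convolution."""
    def __init__(self, in_ch: int, out_ch: int, rates=(2, 3, 4, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # padding=r keeps the spatial size unchanged for a 3x3 kernel with dilation r
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]   # one multi-scale feature map per rate
        return self.fuse(torch.cat(feats, dim=1))          # splice, then fuse

# usage: shallow backbone features, e.g. a (N, 1024, 64, 64) tensor
x = torch.randn(1, 1024, 64, 64)
y = ParallelDilatedConv(1024, 256)(x)   # -> (1, 256, 64, 64)
```

Because the branches share the same input and have no data dependence on each other, they can be dispatched to different devices in a distributed setting, which is the parallelism benefit described later in the text.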
Further, repairing the image-level information with the fully connected conditional random field comprises the following steps:
the energy function used by the fully connected conditional random field is
E(x) = \sum_i \theta_i(x_i) + \sum_{i < j} \theta_{ij}(x_i, x_j);
the unary potential function describes the agreement between an observation and its label:
\theta_i(x_i) = -\log P(x_i),
wherein i indexes the pixels and P(x_i) is the probability the network assigns to the class of pixel i; the binary potential function describes the correlation between observations:
\theta_{ij}(x_i, x_j) = \mu(x_i, x_j) \sum_m w_m k_m(f_i, f_j),
wherein \mu(x_i, x_j) = 1 when x_i \neq x_j and \mu(x_i, x_j) = 0 otherwise, k_m(f_i, f_j) is a Gaussian kernel between f_i and f_j, f_i is the color information of pixel i, i.e. its feature vector, and w_m is the weight of the Gaussian kernel;
in the process of minimizing the energy function, unreasonably classified pixels in the image are corrected, and the repaired semantic segmentation result is obtained.
The invention has the beneficial effects that:
according to the remote sensing image semantic segmentation method based on parallel hole convolution, a high-resolution remote sensing image is obtained in advance, the high-resolution remote sensing image is sliced, normalization and standardization are carried out, a source high-resolution remote sensing image is obtained, a low-layer network of a network resnet101 is extracted based on a resnet101 parameter initialization feature pre-trained on an ImageNet, a parallel hole convolution network is constructed, shallow layer features of the source high-resolution remote sensing image are extracted, the shallow layer features are input into the parallel hole convolution network to obtain multi-scale information, the multi-scale information is fused, the fused features are fused with the shallow layer features again, and image-level information is restored by using a full-connection condition random field to obtain semantic segmentation result, so that a convolution receptive field is enlarged under the condition that additional parameters are not increased, and compared with standard convolution reaching the same receptive field, the parallel hole convolution method can save display memory; the parallel computing structure is adopted, so that nodes in the neural network computing graph can be conveniently distributed on distributed hardware, and the computing speed is improved; the multi-scale information is beneficial to capturing detail objects and large objects by a network, small target objects are not easy to miss, and semantic segmentation precision is improved; in addition, the cavity convolution can widely sense the adjacent object of the target object, pixel-level classification can be effectively carried out by means of the adjacent information, and the method has better pixel-level classification effect compared with standard convolution.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of hole convolution sampling with different expansion rates for a remote sensing image semantic segmentation method based on parallel hole convolution according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a multi-scale parallel hole convolution network of a remote sensing image semantic segmentation method based on parallel hole convolution according to an embodiment of the invention;
fig. 3 is a parallel hole convolution semantic segmentation result of a remote sensing image semantic segmentation method based on parallel hole convolution according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the invention, fall within the scope of protection of the invention.
According to the embodiment of the invention, a remote sensing image semantic segmentation method based on parallel hole convolution is provided.
As shown in fig. 1-3, the remote sensing image semantic segmentation method based on parallel hole convolution according to an embodiment of the present invention includes the following steps:
acquiring a high-resolution remote sensing image in advance, slicing it, and normalizing and standardizing the slices to obtain the source high-resolution remote sensing image;
initializing the feature extraction network with resnet101 parameters pre-trained on ImageNet, taking the low-level layers of resnet101, constructing a parallel hole convolution network, and extracting shallow features of the source high-resolution remote sensing image;
feeding the shallow features into the parallel hole convolution network to obtain multi-scale information and fusing the multi-scale information, wherein the multi-scale information is captured by setting different expansion rates;
re-fusing the fused features with the shallow features, and repairing image-level information with a fully connected conditional random field to obtain the semantic segmentation result.
The slicing of the high-resolution remote sensing image comprises slicing it into tiles whose length and width are 512 pixels.
The method further comprises the following step:
extracting the three RGB channels from the sliced high-resolution remote sensing image.
Wherein, the parallel hole convolution network comprises the following steps:
starting from standard convolution: let F: \mathbb{Z}^2 \to \mathbb{R} be a discrete function, let \Omega_r = [-r, r]^2 \cap \mathbb{Z}^2, and let k: \Omega_r \to \mathbb{R} be a discrete convolution kernel; the convolution centered on p is computed as
(F * k)(p) = \sum_{s + t = p} F(s)\,k(t);
generalizing the standard convolution with an expansion rate l gives the hole convolution
(F *_l k)(p) = \sum_{s + l \cdot t = p} F(s)\,k(t);
carrying out hole convolution on the shallow features in parallel with different expansion rates to obtain multi-scale features, and fusing the multi-scale features by splicing to form the parallel hole convolution network layer.
Wherein the expansion rates are set to 2, 3, 4, and 5, respectively.
The repair of image-level information by the fully connected conditional random field comprises the following steps:
the energy function used by the fully connected conditional random field is
E(x) = \sum_i \theta_i(x_i) + \sum_{i < j} \theta_{ij}(x_i, x_j);
the unary potential function describes the agreement between an observation and its label:
\theta_i(x_i) = -\log P(x_i),
wherein i indexes the pixels and P(x_i) is the probability the network assigns to the class of pixel i; the binary potential function describes the correlation between observations:
\theta_{ij}(x_i, x_j) = \mu(x_i, x_j) \sum_m w_m k_m(f_i, f_j),
wherein \mu(x_i, x_j) = 1 when x_i \neq x_j and \mu(x_i, x_j) = 0 otherwise, k_m(f_i, f_j) is a Gaussian kernel between f_i and f_j, f_i is the color information of pixel i, i.e. its feature vector, and w_m is the weight of the Gaussian kernel.
By means of the above technical scheme, the high-resolution remote sensing image is sliced, the low-level layers of a pre-trained resnet101 are transferred as the feature extraction network, and the shallow features of the sliced images are extracted; a parallel hole convolution is constructed with the convolution kernel expansion rates set to 2 through 5, the shallow features are fed into the parallel hole convolution network, and the information at different scales is spliced; the output features of the hole convolution network are re-fused with the shallow features, the resolution is recovered by upsampling, and the segmentation result is repaired with a conditional random field; finally, the segmentation results of the slices are merged, and unreasonable predictions are repaired by simply filling holes and removing small connected domains.
Specifically, in one embodiment, the method includes the steps of:
s1: preprocessing a high-resolution remote sensing image, wherein the resolution of the high-resolution remote sensing image is too high, the memory and the video memory of a general computer are not easy to bear the calculation of the whole image, and the image is sliced to 512 pixels long and wide according to the 512 resolution commonly used in a main stream semantic segmentation model;
s2: to be compatible with conventional deep convolutional neural networks, three RGB channels need to be extracted from the sliced remote sensing image. Conventional data enhancement was performed: random horizontal overturn, random vertical overturn and color dithering. When the data is enhanced, the labeling image also carries out the same processing along with the RGB image;
s3: properly scaling the RGB three-channel tensorAnd (5) putting, namely normalizing. Assuming a total of m RGB images in the dataset, these RGB images may be divided into 3 channel tensors x 1 ,x 2 ,x 3 ]Normalization of tensors gives [ y ] 1 ,y 2 ,y 3 ]The tensor normalization formula for each channel is:
s4: and then normalized according to the mean value mu and standard deviation sigma of each channel to obtain tensor [ z ] 1 ,z 2 ,z 3 ]The normalized calculation formula is:
s5: based on the network initialized by the resnet101, intercepting the layers 1 to 4, wherein the hole convolution expansion rate of the layer4 is 2, and the hole convolution expansion rates of the layers 1 to 3 are 1, which is equivalent to standard common convolution;
s6: the method comprises the steps of carrying out cavity space pyramid convolution on features output by the resnet101, carrying out parallel convolution with different expansion rates, and replacing global pooled branches by standard convolution instead of global pooled branches, so as to obtain semantic information deeply and improve classification accuracy;
s7: the jump level structure is used for fusing the low-level features generated by layer1 in the resnet101 with the spatial pyramid convolution result after linear interpolation, the low-level features can bring partial position information to the high-level features, and as global pooling is cancelled in the spatial pyramid convolution layer, the position information of the image-level features is lost in the network, the rough segmentation result output by the network is required to be subjected to post-processing based on a conditional random field;
s8: calculating loss by using cross entropy, wherein the object distribution of the remote sensing image is unbalanced, so that weight is added to each class of object during cross entropy calculation, calculating gradient is counter-propagated in a calculation graph through the loss function, and network parameters are updated;
s9: the optimization method of model training adopts Adadelta, and the initial learning rate is set to be 1e -1
Adadelta converges quickly in the early stage of training. The feature extraction backbone of the model is resnet101; although a resnet101 pre-trained on ImageNet cannot directly detect specific remote-sensing objects, it effectively perceives low-level information such as edges, corners, and colors, so the feature extraction layers are initialized with ImageNet pre-trained resnet101 parameters, giving the network a good initial solution. The remaining layer parameters are randomly initialized from a Gaussian distribution;
s10: the model can be converged after traversing the whole data set 256 times, the batch size is set to be 8, and the total iteration number of model training is 5e 4
S11: the high-resolution remote sensing image cannot be segmented in a single pass, so the slices are semantically segmented one by one, and when the slices are spliced back together, unreasonable prediction results are repaired by simply filling holes and removing small connected domains.
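A minimal sketch of this repair step, assuming SciPy and scikit-image are available; the minimum component size and the nearest-pixel relabeling of removed components are illustrative choices not fixed by the patent.

```python
import numpy as np
from scipy.ndimage import binary_fill_holes, distance_transform_edt
from skimage.morphology import remove_small_objects

def clean_prediction(pred: np.ndarray, num_classes: int, min_size: int = 256) -> np.ndarray:
    """pred: (H, W) array of class indices; returns the repaired label map."""
    out = pred.copy()
    # 1) fill holes enclosed inside each class region
    for c in range(num_classes):
        mask = out == c
        out[binary_fill_holes(mask) & ~mask] = c
    # 2) drop connected domains smaller than min_size and relabel their pixels
    #    with the class of the nearest surviving pixel
    keep = np.zeros_like(out, dtype=bool)
    for c in range(num_classes):
        keep |= remove_small_objects(out == c, min_size=min_size)
    dist, idx = distance_transform_edt(~keep, return_indices=True)
    return out[idx[0], idx[1]]
```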
In addition, as shown in fig. 1, panels (a)/(b)/(c) show hole convolutions sampling the features with expansion rates of 1, 2, and 3, respectively; the receptive field grows as the expansion rate increases. By setting the expansion rate, hole convolution samples the features sparsely, and any expansion rate can be used, which makes it possible to control the receptive field explicitly and capture context information in dense prediction tasks. Setting the expansion rate does not change the structure of the original network parameters, which is friendly to transfer learning: after the expansion rate is set, the network can be fine-tuned from the original parameters.
In addition, according to steps S1 to S4, the GID high-resolution remote sensing images are sliced to a resolution of 512, normalized, and standardized. Statistics over the normalized dataset give RGB channel means of 0.3515224, 0.38427463, 0.35403764 and standard deviations of 0.19264674, 0.18325084, 0.17028946.
In addition, as shown in fig. 2, according to steps S5 and S6, the feature extraction layers are initialized with resnet101 parameters pre-trained on ImageNet and the lower layers of resnet101 are taken; these lower layers effectively detect positional information such as edges and corners. A parallel hole convolution network is then constructed with the expansion rates set to 2, 3, 4, and 5, each convolution kernel being a tensor of size (3, 3). The shallow features are fed into the parallel hole convolution network to obtain multi-scale information, which is fused by splicing; the computation of the hole convolution over the input features is illustrated in fig. 2.
In addition, according to step S7, the fused features are fused again with the shallow features to recover positional detail, and the resolution is restored by upsampling. The segmentation result at this point is coarse, and the image-level information must be repaired with reference to the original image, which improves the semantic segmentation result.
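A minimal sketch of this repair step using the third-party pydensecrf package, a common implementation of the fully connected conditional random field; the patent does not name a specific implementation, and the kernel widths, weights, and iteration count shown here are illustrative assumptions.

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image: np.ndarray, probs: np.ndarray, iters: int = 5) -> np.ndarray:
    """image: (H, W, 3) uint8 RGB slice; probs: (num_classes, H, W) float32 softmax map."""
    num_classes, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, num_classes)
    d.setUnaryEnergy(np.ascontiguousarray(unary_from_softmax(probs)))  # unary term: -log P(x_i)
    d.addPairwiseGaussian(sxy=3, compat=3)                             # position-only smoothness kernel
    d.addPairwiseBilateral(sxy=80, srgb=13,                            # appearance kernel over color + position
                           rgbim=np.ascontiguousarray(image), compat=10)
    q = d.inference(iters)                                             # approximate energy minimization
    return np.argmax(np.array(q), axis=0).reshape(h, w)                # repaired label map
```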
In addition, according to steps S8 to S10, the dataset is traversed for forward computation, and the loss is updated after each batch. Pixels are counted per class to obtain the class proportions, which are fused into the cross-entropy loss as per-class weighting coefficients. Starting from the loss node, the computation graph is traversed backwards to obtain gradients and update the model parameters. The optimizer is Adadelta, which converges quickly in the early and middle stages of training.
In addition, as shown in fig. 3, the parameters of the semantic segmentation network obtained after training are loaded into a network of the corresponding structure at inference time, and each slice of the high-resolution remote sensing image is segmented. Panels (a)/(b)/(c) of fig. 3 show the original slice, the corresponding ground-truth label, and the parallel hole convolution segmentation result, respectively; different ground-object classes are rendered with different pixel values. Fig. 3 shows that the remote sensing image semantic segmentation method based on parallel hole convolution achieves a good result, with the segmentation close to the ground-truth annotation.
In addition, according to step S11, the semantic segmentation results of the respective slices are consolidated, and when the respective slices are spliced, unreasonable prediction results are repaired by simply filling holes and removing small connected domains.
In summary, by means of the above technical solution, a high-resolution remote sensing image is acquired in advance, sliced, normalized, and standardized to obtain the source high-resolution remote sensing image; the feature extraction network is initialized with resnet101 parameters pre-trained on ImageNet, the low-level layers of resnet101 are taken, a parallel hole convolution network is constructed, and shallow features of the source image are extracted; the shallow features are fed into the parallel hole convolution network to obtain multi-scale information, the multi-scale information is fused, the fused features are re-fused with the shallow features, and image-level information is repaired with a fully connected conditional random field to obtain the semantic segmentation result, so that the convolution receptive field is enlarged without adding extra parameters. The parallel computing structure makes it easy to distribute the nodes of the neural network computation graph over distributed hardware, improving computation speed; the multi-scale information helps the network capture both fine details and large objects, so small targets are less likely to be missed and segmentation accuracy is improved; in addition, hole convolution perceives a wide neighborhood around the target object, and this neighboring information supports effective pixel-level classification, giving a better pixel-level classification effect than standard convolution.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims (1)

1. A remote sensing image semantic segmentation method based on parallel hole convolution, characterized by comprising the following steps:
acquiring a high-resolution remote sensing image in advance, slicing it, and normalizing and standardizing the slices to obtain the source high-resolution remote sensing image;
initializing the feature extraction network with resnet101 parameters pre-trained on ImageNet, taking the low-level layers of resnet101, constructing a parallel hole convolution network, and extracting shallow features of the source high-resolution remote sensing image; based on the resnet101-initialized network, intercepting layer1 to layer4, wherein the hole convolution expansion rate of layer4 is 2 and the expansion rates of layers 1 to 3 are 1, which is equivalent to standard convolution;
feeding the shallow features into the parallel hole convolution network to obtain multi-scale information and fusing the multi-scale information, wherein the multi-scale information is captured by setting different expansion rates; applying hole spatial pyramid convolution to the features output by resnet101 with parallel convolutions of different expansion rates, and replacing the global pooling branch with a standard convolution branch; fusing the low-level features generated by layer1 of resnet101 with the linearly interpolated spatial pyramid convolution result using a skip-level structure;
re-fusing the fused features with the shallow features, and repairing image-level information with a fully connected conditional random field to obtain the semantic segmentation result;
performing semantic segmentation on the normalized and standardized slices one by one, and repairing unreasonable prediction results by simply filling holes and removing small connected domains when the semantic segmentation results of the slices are spliced;
the parallel hole convolution network comprises the following steps:
starting from standard convolution: let F: \mathbb{Z}^2 \to \mathbb{R} be a discrete function, let \Omega_r = [-r, r]^2 \cap \mathbb{Z}^2, and let k: \Omega_r \to \mathbb{R} be a discrete convolution kernel; the convolution centered on p is computed as
(F * k)(p) = \sum_{s + t = p} F(s)\,k(t);
generalizing the standard convolution with an expansion rate l gives the hole convolution
(F *_l k)(p) = \sum_{s + l \cdot t = p} F(s)\,k(t);
the shallow features are subjected to hole convolution with different expansion rates in parallel to obtain multi-scale features, and the multi-scale features are fused by splicing, forming the parallel hole convolution network layer;
repairing the image-level information with the fully connected conditional random field comprises the following steps:
the energy function used by the fully connected conditional random field is
E(x) = \sum_i \theta_i(x_i) + \sum_{i < j} \theta_{ij}(x_i, x_j);
the unary potential function describes the agreement between an observation and its label:
\theta_i(x_i) = -\log P(x_i),
wherein i indexes the pixels and P(x_i) is the probability the network assigns to the class of pixel i; the binary potential function describes the correlation between observations:
\theta_{ij}(x_i, x_j) = \mu(x_i, x_j) \sum_m w_m k_m(f_i, f_j),
wherein \mu(x_i, x_j) = 1 when x_i \neq x_j and \mu(x_i, x_j) = 0 otherwise, k_m(f_i, f_j) is a Gaussian kernel between f_i and f_j, f_i is the color information of pixel i, i.e. its feature vector, and w_m is the weight of the Gaussian kernel;
slicing the high-resolution remote sensing image comprises slicing it into tiles whose length and width are 512 pixels;
the method further comprises the following step:
extracting the three RGB channels from the sliced high-resolution remote sensing image;
the expansion rates are set to 2, 3, 4, and 5, respectively.
CN202110129416.3A 2021-01-29 2021-01-29 Remote sensing image semantic segmentation method based on parallel hole convolution Active CN112837320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110129416.3A CN112837320B (en) 2021-01-29 2021-01-29 Remote sensing image semantic segmentation method based on parallel hole convolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110129416.3A CN112837320B (en) 2021-01-29 2021-01-29 Remote sensing image semantic segmentation method based on parallel hole convolution

Publications (2)

Publication Number Publication Date
CN112837320A CN112837320A (en) 2021-05-25
CN112837320B true CN112837320B (en) 2023-10-27

Family

ID=75931168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110129416.3A Active CN112837320B (en) 2021-01-29 2021-01-29 Remote sensing image semantic segmentation method based on parallel hole convolution

Country Status (1)

Country Link
CN (1) CN112837320B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486840B (en) * 2021-07-21 2022-08-30 武昌理工学院 Building rapid extraction method based on composite network correction
CN113780297B (en) * 2021-09-15 2024-03-12 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium
CN114067221B (en) * 2022-01-14 2022-04-15 成都数联云算科技有限公司 Remote sensing image woodland extraction method, system, device and medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062756A (en) * 2018-01-29 2018-05-22 重庆理工大学 Image, semantic dividing method based on the full convolutional network of depth and condition random field
CN108877832A (en) * 2018-05-29 2018-11-23 东华大学 A kind of audio sound quality also original system based on GAN
CN109285162A (en) * 2018-08-30 2019-01-29 杭州电子科技大学 A kind of image, semantic dividing method based on regional area conditional random field models
CN109461157A (en) * 2018-10-19 2019-03-12 苏州大学 Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field
CN109741383A (en) * 2018-12-26 2019-05-10 西安电子科技大学 Picture depth estimating system and method based on empty convolution sum semi-supervised learning
CN110070022A (en) * 2019-04-16 2019-07-30 西北工业大学 A kind of natural scene material identification method based on image
CN110232394A (en) * 2018-03-06 2019-09-13 华南理工大学 A kind of multi-scale image semantic segmentation method
CN110781775A (en) * 2019-10-10 2020-02-11 武汉大学 Remote sensing image water body information accurate segmentation method supported by multi-scale features
CN111259828A (en) * 2020-01-20 2020-06-09 河海大学 High-resolution remote sensing image multi-feature-based identification method
CN111539959A (en) * 2020-07-13 2020-08-14 浙江省肿瘤医院(浙江省癌症中心) Thyroid nodule ultrasonic image processing method based on cross-layer sparse hole convolution
CN112183360A (en) * 2020-09-29 2021-01-05 上海交通大学 Lightweight semantic segmentation method for high-resolution remote sensing image

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2580671B (en) * 2019-01-22 2022-05-04 Toshiba Kk A computer vision system and method

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062756A (en) * 2018-01-29 2018-05-22 重庆理工大学 Image, semantic dividing method based on the full convolutional network of depth and condition random field
CN110232394A (en) * 2018-03-06 2019-09-13 华南理工大学 A kind of multi-scale image semantic segmentation method
CN108877832A (en) * 2018-05-29 2018-11-23 东华大学 A kind of audio sound quality also original system based on GAN
CN109285162A (en) * 2018-08-30 2019-01-29 杭州电子科技大学 A kind of image, semantic dividing method based on regional area conditional random field models
CN109461157A (en) * 2018-10-19 2019-03-12 苏州大学 Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field
CN109741383A (en) * 2018-12-26 2019-05-10 西安电子科技大学 Picture depth estimating system and method based on empty convolution sum semi-supervised learning
CN110070022A (en) * 2019-04-16 2019-07-30 西北工业大学 A kind of natural scene material identification method based on image
CN110781775A (en) * 2019-10-10 2020-02-11 武汉大学 Remote sensing image water body information accurate segmentation method supported by multi-scale features
CN111259828A (en) * 2020-01-20 2020-06-09 河海大学 High-resolution remote sensing image multi-feature-based identification method
CN111539959A (en) * 2020-07-13 2020-08-14 浙江省肿瘤医院(浙江省癌症中心) Thyroid nodule ultrasonic image processing method based on cross-layer sparse hole convolution
CN112183360A (en) * 2020-09-29 2021-01-05 上海交通大学 Lightweight semantic segmentation method for high-resolution remote sensing image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Multi-Receptive Atrous Convolutional Network for Semantic Segmentation;Mingyang Zhong等;《2020 International Joint Conference on Neural Network》;20200731;第1-8页 *
Research on Named Entity Recognition Methods Based on Deep Learning; Li Ping; China Excellent Master's and Doctoral Dissertations Full-text Database (Master's), Information Science and Technology Series; 2020-07-15 (No. 07); pp. I138-1592 *

Also Published As

Publication number Publication date
CN112837320A (en) 2021-05-25

Similar Documents

Publication Publication Date Title
CN112837320B (en) Remote sensing image semantic segmentation method based on parallel hole convolution
CN110232394B (en) Multi-scale image semantic segmentation method
CN110135366B (en) Shielded pedestrian re-identification method based on multi-scale generation countermeasure network
Henry et al. Road segmentation in SAR satellite images with deep fully convolutional neural networks
CN108154192B (en) High-resolution SAR terrain classification method based on multi-scale convolution and feature fusion
CN111582043B (en) High-resolution remote sensing image ground object change detection method based on multitask learning
CN106845529B (en) Image feature identification method based on multi-view convolution neural network
CN113076871B (en) Fish shoal automatic detection method based on target shielding compensation
CN112541904B (en) Unsupervised remote sensing image change detection method, storage medium and computing device
CN110598600A (en) Remote sensing image cloud detection method based on UNET neural network
CN108038435B (en) Feature extraction and target tracking method based on convolutional neural network
CN111652892A (en) Remote sensing image building vector extraction and optimization method based on deep learning
CN110287777B (en) Golden monkey body segmentation algorithm in natural scene
WO2020062360A1 (en) Image fusion classification method and apparatus
CN110458192B (en) Hyperspectral remote sensing image classification method and system based on visual saliency
CN112233129B (en) Deep learning-based parallel multi-scale attention mechanism semantic segmentation method and device
CN113269224B (en) Scene image classification method, system and storage medium
CN107506792B (en) Semi-supervised salient object detection method
CN112101364B (en) Semantic segmentation method based on parameter importance increment learning
CN111539314A (en) Cloud and fog shielding-oriented sea surface target significance detection method
CN111179196B (en) Multi-resolution depth network image highlight removing method based on divide-and-conquer
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
Li et al. An aerial image segmentation approach based on enhanced multi-scale convolutional neural network
Zuo et al. A remote sensing image semantic segmentation method by combining deformable convolution with conditional random fields
CN117058546A (en) High-resolution remote sensing image building extraction method of global local detail perception conditional random field

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230615

Address after: 430074 Hubei Province, Wuhan city Hongshan District Luoyu Road No. 1037

Applicant after: HUAZHONG University OF SCIENCE AND TECHNOLOGY

Address before: Room 305-65, 2-3 / F, 5-15 / F, R & D building / unit 1, modern service industry base, Science Park, Huazhong University of science and technology, 13-1, daxueyuan Road, Donghu New Technology Development Zone, Wuhan City, Hubei Province, 430000

Applicant before: Wuhan shanlai Technology Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant