CN116935226A - HRNet-based improved remote sensing image road extraction method, system, equipment and medium - Google Patents
- Publication number
- CN116935226A (application CN202310959573.6A)
- Authority
- CN
- China
- Prior art keywords
- remote sensing
- hrnet
- sensing image
- network
- road
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Image Analysis (AREA)
Abstract
An improved HRNet-based remote sensing image road extraction method, system, equipment and medium. The method comprises the following steps: preprocessing the high-resolution remote sensing image data of the CHN6-CUG data set to obtain a preprocessed remote sensing image data set; constructing a road prediction model based on an improved HRNet; setting the network training parameters; training the improved HRNet network with the training set and the verification set of the remote sensing image data set according to the network training parameters; and inputting the test set into the trained model for prediction and outputting the predicted binary road map. The system, equipment and medium are used to implement the above method. The invention improves the original HRNet network with atrous spatial pyramid pooling (ASPP), a channel attention mechanism and depthwise separable convolution, and uses the CHN6-CUG remote sensing image data set for model training and prediction. It effectively extracts road information from high-resolution remote sensing images and offers strong model adaptability, high road extraction accuracy and a lightweight network.
Description
Technical Field
The invention belongs to the technical field of remote sensing image road extraction, and particularly relates to an improved remote sensing image road extraction method, system, equipment and medium based on HRNet.
Background
Roads are an indispensable element of urban development and planning. Road information is an important component in the study and analysis of urban geographic information systems and is widely used in vehicle navigation, intelligent transportation, cartography, urban planning and other fields. With the rapid development of remote sensing technology, the spatial resolution of remote sensing images has steadily improved; it has now reached the decimeter level and continues to rise. Extracting road information from remote sensing images is a research hotspot in urban planning, social development and related fields.
Currently, most road extraction studies still use a semi-automatic approach (Zhao Linghu, Yuan Xiping, Gan Shu, et al. Improved DeepLabv3+ road extraction model for high-resolution remote sensing images [J]. Natural Resource Remote Sensing, 2023, 35(01): 107-114.), which requires the starting point of the road to be given manually and separates road information from non-road information according to specific rules and logic. This approach suffers from poor algorithm robustness, low recognition accuracy, a cumbersome workflow and long processing time, all of which make road extraction difficult.
In recent years, deep learning theory, represented mainly by convolutional neural networks, has developed rapidly. Methods based on deep convolutional neural networks have achieved state-of-the-art performance in a variety of computer vision tasks, such as scene recognition and object detection. Researchers have also gradually begun to use deep learning to solve segmentation and interpretation problems for remote sensing data.
With the progress and innovation of remote sensing technology, the spatial, spectral and temporal resolution of remote sensing data keeps improving. Ground-object details and background information in remote sensing images have become richer, and non-road information such as vegetation shadows, traffic flow, occlusion by high-rise buildings and pedestrian flow acts as complex interference that makes road information harder to identify. These problems place higher demands on the adaptability and segmentation accuracy of deep learning image segmentation models.
Since around 1970, researchers in China and abroad have tried to extract road information from remote sensing images from different angles. Road extraction methods are mainly classified in three ways (Shi Wenzhong, Zhu Changqing, Wang. Review and prospects of methods for extracting road features from remote sensing images [J]. Acta Geodaetica et Cartographica Sinica, 2001, 30(3): 258-262.): by degree of automation, into fully automatic, semi-automatic and purely manual methods; by the image characteristics of road information, into radiometric, background, topological and geometric feature extraction methods; and by the element extracted, into road centerline extraction and road area extraction. Researchers first used traditional methods to extract roads from remote sensing images. Traditional road extraction methods fall roughly into three categories: template matching, knowledge-driven and object-oriented. The main problems of traditional computer-vision-based road extraction methods are as follows:
(1) The transferability of the methods is insufficient, and different models must be matched to different scenes.
(2) The segmentation accuracy is not high, and the segmentation quality depends on the experience of researchers.
(3) The calculation cost is high and manual participation is required.
Most traditional road extraction methods rely on manually selected features, which limits road extraction accuracy and makes it difficult to meet modern demands for high-precision road information. Meanwhile, deep learning methods have developed rapidly and show excellent performance in computer image processing.
In recent years, convolutional neural networks (CNNs) have made great progress in computer vision tasks thanks to their efficient feature learning capability, and deep learning methods based on them continue to evolve. In 2019, Ke Sun et al. (Sun K, Xiao B, Liu D, et al. Deep high-resolution representation learning for human pose estimation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 5693-5703.) proposed HRNet (High-Resolution Net). The network was initially applied to human pose estimation, where it relies on high-resolution feature representations to achieve better performance. HRNet can be applied to a variety of computer vision tasks, including pose estimation, semantic segmentation, object detection and classification (Sun K, Zhao Y, Jiang B, et al. High-resolution representations for labeling pixels and regions [J]. arXiv preprint arXiv:1904.04514, 2019.), and performs relatively well on them. HRNet maintains a high-resolution representation throughout feature extraction by gradually adding high-to-low resolution sub-networks and repeatedly exchanging information among them to fuse multi-scale features. Its feature representation capability is strong, and its results have high spatial precision. However, the network repeatedly fuses sub-networks of different resolutions while maintaining a high-resolution feature map, which leads to high computational complexity and a large number of parameters. As a result, it is difficult to build deeper network structures with HRNet, and therefore difficult to achieve more accurate results.
Defects and deficiencies of the prior art:
1. In cities with dense roads, the road structure is intricate. Existing models adapt poorly when segmenting varied road structures and perform poorly on multi-scale road targets.
2. In the high-resolution remote sensing image, on one hand, the surrounding environment of a road may be very similar to the road, and on the other hand, road components in the image may be blocked by surrounding obstacles, so that the deep learning model cannot accurately extract road information in the image, and the model extraction precision is limited.
3. HRNet continuously exchanges information and fuses features among its parallel sub-networks. This process involves a large amount of repeated computation, which produces a large number of parameters during model training and increases the computational complexity of the model.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention aims to provide an improved HRNet-based remote sensing image road extraction method, system, equipment and medium. Starting from the HRNet network, the invention improves the original network with atrous spatial pyramid pooling (ASPP), a channel attention mechanism and depthwise separable convolution, and uses the CHN6-CUG remote sensing image data set for model training and prediction. It effectively extracts road information from high-resolution remote sensing images, supports smart city planning and construction, and offers strong model adaptability, high road extraction accuracy and a lightweight network.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
an improved remote sensing image road extraction method based on HRNet comprises the following steps:
step 1, preprocessing high-resolution remote sensing image data in a CHN6-CUG data set to obtain a preprocessed remote sensing image data set, wherein the remote sensing image data set comprises a training set, a verification set and a test set;
step 2, constructing a road prediction model based on improved HRNet;
step 3, setting network training parameters, wherein the training parameters comprise the number of training epochs, the batch size and the learning rate;
step 4, training the improved HRNet network constructed in step 2 with the training set and the verification set of the remote sensing image data set obtained in step 1, according to the network training parameters set in step 3; training the network model comprises loading the training set and the verification set of the CHN6-CUG data set and loading the network model, wherein the network is trained with DiceLoss and CrossEntropyLoss, and their average is taken as the loss function;
step 5, road prediction: inputting the test set obtained in step 1 into the model trained in step 4 for prediction, and outputting the predicted binary road map.
The specific implementation method of the step 1 comprises the following substeps:
step 101: augmenting the high-resolution remote sensing image data in the CHN6-CUG data set;
step 102: dividing the CHN6-CUG data set augmented in step 101 into a training set, a verification set and a test set in a given proportion;
step 103: standardizing and normalizing the training set, the verification set and the test set obtained in step 102, so that the pixel values of the high-resolution remote sensing image data are adjusted to a similar range and the data are mean-centered.
The specific implementation method of the step 2 comprises the following substeps:
step 201: processing the two feature maps output by the first stage of the HRNet network with an atrous spatial pyramid pooling (ASPP) module, to enlarge the receptive field of the feature maps and increase the multi-scale road feature information they contain;
step 202: using a channel attention mechanism to compute weights for the different channels of the feature map output by the ASPP module, and combining the weights with the feature map to strengthen the representation of important feature information;
step 203: improving the residual structure in the HRNet network with depthwise separable convolution, by replacing one 3x3 standard convolution layer in the residual structure with a 3x3 depthwise separable convolution.
The invention also provides an improved remote sensing image road extraction system based on HRNet, which comprises:
an atrous spatial pyramid pooling (ASPP) module: used to process the two feature maps output by the first stage of the HRNet network, enlarge the receptive field of the feature maps, and increase the multi-scale road feature information they contain;
a channel attention mechanism module: used to compute weights for the different channels of the feature map output by the ASPP module and combine the weights with the feature map, strengthening the representation of important feature information;
a lightweight module: used to improve the residual structure in the HRNet network with depthwise separable convolution, replacing one 3x3 standard convolution layer in the residual structure with a 3x3 depthwise separable convolution.
The invention also provides an improved remote sensing image road extraction device based on HRNet, which comprises:
a memory: a computer-readable device that stores a computer program of the improved remote sensing image road extraction method based on HRNet;
a processor: which implements the improved remote sensing image road extraction method based on HRNet when executing the computer program.
The invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the improved remote sensing image road extraction method based on HRNet.
Compared with the prior art, the invention has the beneficial effects that:
First, an atrous spatial pyramid pooling (ASPP) module is introduced into HRNet, improving the model's ability to extract multi-scale road targets. By stacking atrous convolution layers with different dilation rates and combining them with an average pooling layer and the original feature map, the ASPP module effectively enlarges the receptive field of each point in the output feature map. A well-chosen combination of dilation rates mitigates the gridding problem caused by atrous convolution while letting the receptive field cover the entire underlying feature map, so that the model can effectively extract multi-scale road feature information.
Second, a channel attention mechanism is introduced into HRNet, addressing the weak representation of narrow roads in the model and improving the quality of feature extraction. The channels of a feature map differ in importance, which the original HRNet network does not take into account. The channel attention mechanism models the interdependence among the channels of the feature map, strengthening the representation of important features and improving the feature extraction of the model.
Third, the HRNet network is made lightweight with depthwise separable convolution. HRNet uses a parallel structure that continuously exchanges information and fuses features among its sub-networks, which produces a very large number of parameters and raises the computational complexity. Replacing the standard convolution in the HRNet residual module with depthwise separable convolution effectively reduces the number of model parameters and speeds up training without noticeably affecting model accuracy.
In summary, starting from the HRNet network, the invention improves the original network with atrous spatial pyramid pooling (ASPP), a channel attention mechanism and depthwise separable convolution, and uses the CHN6-CUG remote sensing image data set for model training and prediction. It effectively extracts road information from high-resolution remote sensing images and offers strong model adaptability, high road extraction accuracy and a lightweight network.
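The channel attention described in the second effect above can be illustrated with a minimal sketch. The text does not specify the attention design, so the sketch assumes a squeeze-and-excitation style layout (global average pooling, two fully connected layers, sigmoid, channel-wise scaling); the names `channel_attention`, `w1` and `w2` are hypothetical and the weights are placeholders rather than trained values.

```python
import math
from typing import List

FeatureMap = List[List[float]]  # one flattened spatial map per channel


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(fmap: FeatureMap,
                      w1: List[List[float]],
                      w2: List[List[float]]) -> FeatureMap:
    """Reweight channels: global average pool -> FC + ReLU -> FC + sigmoid -> scale.

    Assumed layout (not taken from the source): w1 has one row per hidden unit,
    w2 has one row per channel.
    """
    # Squeeze: one average value per channel.
    squeezed = [sum(ch) / len(ch) for ch in fmap]
    # Excitation: small bottleneck with ReLU, then one sigmoid weight per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every value of a channel by that channel's weight.
    return [[weights[c] * v for v in ch] for c, ch in enumerate(fmap)]
```

Each channel is multiplied by a weight between 0 and 1, so channels judged more important are attenuated less than the others.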
Drawings
FIG. 1 is a schematic flow chart of an embodiment of the present invention.
FIG. 2 is a diagram of the improved network structure according to an embodiment of the present invention.
FIG. 3 is a diagram of combined atrous convolutions with different dilation rates according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of the atrous spatial pyramid pooling (ASPP) structure according to an embodiment of the present invention.
FIG. 5 is a diagram of the channel attention mechanism according to an embodiment of the present invention.
FIG. 6 is a diagram of the depthwise separable convolution method according to an embodiment of the present invention; FIG. 6(a) is a schematic diagram of the original HRNet residual module, and FIG. 6(b) is a schematic diagram of the HRNet residual module modified with depthwise separable convolution.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings.
Road information is an important component in the study and analysis of urban geographic information systems, and road extraction models based on deep learning are of great significance for acquiring road information at scale. With the continuous development of remote sensing technology, the spatial resolution of remote sensing images has improved markedly. High-resolution remote sensing images are rich in detail and can provide high-precision road information, which makes them a good subject for road information extraction. The present embodiment uses the CHN6-CUG remote sensing road data set (http://grzy.cug.edu.cn/zhuqiqi/zh_CN/yjgk/32368/list/index.htm). The data set is preprocessed before training. Atrous spatial pyramid pooling (ASPP) effectively enlarges the receptive field of the feature maps and improves the model's ability to extract multi-scale road information. The channels of a feature map differ in importance; introducing a channel attention mechanism to model the dependencies among channels improves the quality of the network's feature representation and highlights important features. In addition, improving the network's residual module with depthwise separable convolution effectively reduces the number of model parameters and speeds up model training.
The invention provides an improved remote sensing image road extraction method based on HRNet, which comprises the following steps:
step 1, preprocessing high-resolution remote sensing image data in a CHN6-CUG data set to obtain a preprocessed remote sensing image data set, wherein the remote sensing image data set comprises a training set, a verification set and a test set;
the specific implementation method of the step 1 comprises the following substeps:
step 101: augmenting the high-resolution remote sensing image data in the CHN6-CUG data set. Because a single sample image carries limited useful feature information, a large number of samples is needed to train the model so that it learns deeper features. Appropriate augmentation of a given data set effectively increases sample complexity and improves the performance and generalization of the model. The invention expands the CHN6-CUG data set with rotations of 90°, 180° and 270° and with left-right mirroring;
step 102: dividing the CHN6-CUG data set augmented in step 101 into a training set, a verification set and a test set in a given proportion;
step 103: standardizing and normalizing the training set, the verification set and the test set obtained in step 102. Before the image data are input into the HRNet network for training, they are normalized and standardized, so that the pixel values of the high-resolution remote sensing image data are adjusted to a similar range and the data are mean-centered.
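The augmentation of step 101 can be sketched as follows. This is a minimal illustration, assuming a single-band image stored as nested lists and reading the 90°, 180° and 270° operations as rotations; the function names are hypothetical and not taken from the invention.

```python
from typing import List

Image = List[List[int]]  # a single-band image as rows of pixel values


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def mirror_lr(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def augment(img: Image) -> List[Image]:
    """Return the original, its 90/180/270 degree rotations, and its mirror."""
    r90 = rotate90(img)
    r180 = rotate90(r90)
    r270 = rotate90(r180)
    return [img, r90, r180, r270, mirror_lr(img)]
```

In a segmentation setting the same transform would also have to be applied to each road label mask so that image and label stay aligned.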
Step 2, constructing a road prediction model based on improved HRNet;
the specific implementation method of the step 2 comprises the following substeps:
step 201: processing the two feature maps output by the first stage of the HRNet network with an atrous spatial pyramid pooling (ASPP) module, to enlarge the receptive field of the feature maps and increase the multi-scale road feature information they contain;
step 202: using a channel attention mechanism to compute weights for the different channels of the feature map output by the ASPP module, and combining the weights with the feature map to strengthen the important features it contains;
step 203: improving the residual structure in the HRNet network with depthwise separable convolution, by replacing one 3x3 standard convolution layer in the residual structure with a 3x3 depthwise separable convolution. This reduces the number of parameters generated during model training, lowers the computational complexity, and makes the model lightweight.
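The parameter saving of step 203 can be checked with simple arithmetic. The sketch below compares the weight count of a standard 3x3 convolution with that of a 3x3 depthwise convolution followed by a 1x1 pointwise convolution; the channel width of 64 is an assumed example value, not one stated in the text, and biases are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights of a k x k depthwise convolution plus a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


standard = conv_params(64, 64)                   # 64 * 64 * 9 = 36864
separable = depthwise_separable_params(64, 64)   # 64 * 9 + 64 * 64 = 4672
print(standard, separable, round(separable / standard, 3))  # 36864 4672 0.127
```

Under these assumptions the replaced layer keeps roughly one eighth of its weights, which is consistent with the lightweight goal described above.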
Step 3, setting network training parameters, wherein the training parameters comprise the number of training epochs, the batch size and the learning rate;
step 4, training the improved HRNet network constructed in step 2 with the training set and the verification set of the remote sensing image data set obtained in step 1, according to the network training parameters set in step 3; training the network model comprises loading the training set and the verification set of the CHN6-CUG data set and loading the network model, wherein the network is trained with DiceLoss and CrossEntropyLoss, and their average is taken as the loss function;
step 5, road prediction: inputting the test set obtained in step 1 into the model trained in step 4 for prediction, and outputting the predicted binary road map.
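The loss of step 4, the average of a Dice loss and a cross-entropy loss, can be sketched for the binary road/background case. This is an illustration under assumptions: predictions are per-pixel road probabilities in a flat list, the Dice term is the common soft Dice formulation, and the small `eps` constants are added here only for numerical safety.

```python
import math
from typing import Sequence


def dice_loss(probs: Sequence[float], targets: Sequence[int], eps: float = 1e-6) -> float:
    """Soft Dice loss for road (1) versus background (0) pixels."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def cross_entropy_loss(probs: Sequence[float], targets: Sequence[int], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy over all pixels."""
    losses = [-(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
              for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)


def combined_loss(probs: Sequence[float], targets: Sequence[int]) -> float:
    """Average of the Dice and cross-entropy losses."""
    return 0.5 * (dice_loss(probs, targets) + cross_entropy_loss(probs, targets))
```

The Dice term is driven by region overlap and is less sensitive to the imbalance between road and background pixels, while the cross-entropy term penalizes each pixel individually; averaging the two combines both signals.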
The invention also provides an improved remote sensing image road extraction system based on HRNet, which comprises:
an atrous spatial pyramid pooling (ASPP) module: corresponding to step 201, used to process the two feature maps output by the first stage of the HRNet network, enlarge the receptive field of the feature maps, and increase the multi-scale road feature information they contain;
a channel attention mechanism module: corresponding to step 202, used to compute weights for the different channels of the feature map output by the ASPP module and combine the weights with the feature map, strengthening the representation of important feature information;
a lightweight module: corresponding to step 203, used to improve the residual structure in the HRNet network with depthwise separable convolution, replacing one 3x3 standard convolution layer in the residual structure with a 3x3 depthwise separable convolution.
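The idea behind the first module can be shown in one dimension. The sketch below is not the module of the invention: it assumes the parallel-branch layout of conventional ASPP (the text does not say whether the branches run in parallel or in sequence), uses fixed averaging weights instead of learned ones, and the names `dilated_conv1d` and `aspp1d` are hypothetical.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """3-tap dilated convolution with zero padding; output length equals input length."""
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in zip((-1, 0, 1), kernel):
            j = i + k * rate  # taps are spaced `rate` samples apart
            if 0 <= j < n:
                acc += w * x[j]
        out.append(acc)
    return out


def aspp1d(x: Sequence[float], rates: Sequence[int]) -> List[List[float]]:
    """Stack the input, one dilated branch per rate, and a global-average branch."""
    kernel = (1 / 3, 1 / 3, 1 / 3)  # fixed averaging weights, for illustration only
    branches = [list(x)]            # the original feature map
    branches += [dilated_conv1d(x, kernel, r) for r in rates]
    mean = sum(x) / len(x)
    branches.append([mean] * len(x))  # pooled value broadcast back to full length
    return branches
```

A larger dilation rate lets a branch see samples further from the output position with the same three weights, which is how the receptive field grows without adding parameters.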
Referring to FIG. 1, the flow of the remote sensing image road extraction method based on atrous spatial pyramid pooling (ASPP) and depthwise separable convolution according to an embodiment of the present invention comprises three parts: preprocessing the remote sensing image data, constructing and training the improved network model, and testing and verifying the model. The improved HRNet network structure is shown in FIG. 2.
The first part preprocesses the high-resolution remote sensing image data set to improve the usability of the data. To increase the useful feature information in the data set and the number of samples, the data set is augmented. The data set is then divided into a training set, a verification set and a test set in a 6:2:2 ratio. The resulting data set is normalized and standardized to mean-center the data, which completes the preprocessing.
The second part improves the HRNet network: an ASPP module extracts multi-scale road feature information from the data; a channel attention mechanism improves the quality of feature extraction; and depthwise separable convolution improves the network's residual module, reducing the number of parameters generated during model training.
The third part inputs the test set into the trained network model for road prediction and obtains the prediction result.
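The 6:2:2 division mentioned in the first part can be sketched as below. The shuffling, the fixed seed and the function name `split_dataset` are assumptions made for illustration; the text only states the ratio.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence[str],
                  ratios: Tuple[int, int, int] = (6, 2, 2),
                  seed: int = 0) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the items and split them into training, verification and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test subset
    return train, val, test
```

Integer ratios are used instead of fractions so that the subset sizes are computed without floating-point rounding.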
The following details the above steps respectively:
1. Remote sensing data preprocessing
On the one hand, because a single sample image carries limited useful feature information, a large number of samples is needed to train the model so that the deep learning model learns deeper features, extracts roads better and achieves fine semantic segmentation. Deep learning algorithms depend on iterative computation over large numbers of data samples, and the number of samples in the experiment was insufficient, so the data set had to be augmented. Research shows that appropriate augmentation of a given data set effectively increases the complexity of the samples, improves the performance of the trained model and strengthens the generalization of the learned features. The remote sensing image data used in the embodiment of the invention are high-resolution remote sensing images after orthographic projection, whose features change little with angle, so rotation and mirroring can increase the complexity of the data; the data set is expanded with rotations of 90°, 180° and 270° and with left-right mirroring. On the other hand, the remote sensing image data are normalized and standardized. Normalization adjusts the feature values of the input data to a similar range and keeps the gradients from becoming too large or too small, so that model training converges stably. Standardization mean-centers the data. Centering makes the input data follow a common distribution, which makes it easier to obtain good generalization after training and speeds up training and fitting.
After standardization, the background component of the data, that is, the part common to the images, is attenuated, and the features of each image stand out more clearly, which helps optimize and improve the performance of the deep learning model.
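The standardization is calculated as follows:
$$\text{output} = \frac{\text{input} - \operatorname{mean}(\text{input})}{\operatorname{std}(\text{input})} \tag{1}$$
The normalization is calculated as follows:
$$\text{output} = \frac{\text{input} - \min(\text{input})}{\max(\text{input}) - \min(\text{input})} \tag{2}$$
where output and input are the output pixel value and the input pixel value, respectively, and std(), mean(), max() and min() take the standard deviation, the mean, the maximum and the minimum, respectively.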
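The preprocessing described above (augmentation by rotation and mirroring, then standardization and min-max normalization) can be sketched with NumPy as follows. This is a minimal illustration, not the implementation of the invention: the function names, the per-image statistics and the small `eps` term that guards against division by zero are assumptions made for the example.

```python
import numpy as np


def augment(image: np.ndarray) -> list[np.ndarray]:
    """Return the tile, its 90/180/270 degree rotations, and its left-right mirror."""
    rotations = [np.rot90(image, k=k, axes=(0, 1)) for k in (1, 2, 3)]
    mirrored = np.fliplr(image)
    return [image, *rotations, mirrored]


def standardize(image: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Zero-mean, unit-variance scaling: (input - mean) / std.

    Assumption: statistics are taken over the single image passed in.
    """
    image = image.astype(np.float64)
    return (image - image.mean()) / (image.std() + eps)


def min_max_normalize(image: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Rescale pixel values to roughly [0, 1]: (input - min) / (max - min)."""
    image = image.astype(np.float64)
    lowest, highest = image.min(), image.max()
    return (image - lowest) / (highest - lowest + eps)
```

When an image is augmented this way, its road label mask has to go through the same rotation or mirroring so that image and label stay aligned.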
2. Construction and training of improved network models
The road prediction model is an HRNet network model improved by adding an atrous spatial pyramid pooling (ASPP) module, a channel attention mechanism, and depthwise separable convolution.
As shown in fig. 3, the invention selects a suitable combination of dilation rates for the atrous spatial pyramid pooling module according to the hybrid dilated convolution (HDC) principle. Although atrous (dilated) convolution enlarges the receptive field of the feature map, it suffers from the "gridding problem": when dilated convolutions with unsuitable dilation rates are stacked, data are lost, i.e. the receptive field of the resulting feature map cannot cover every pixel of the underlying feature map and part of the information is missing. The HDC principle guides the design of hybrid dilated convolution structures, effectively solves the gridding problem, and mainly comprises three aspects.
Consider n dilated convolution layers with kernel size K × K and dilation rates $[r_1, \ldots, r_i, \ldots, r_n]$. The HDC principle aims to make the receptive field of the feature map obtained after stacking these dilated convolutions cover the whole area of the underlying feature layer, with no holes or missing edges.
The maximum distance between two non-zero points is defined as:
$$M_i = \max\{M_{i+1} - 2r_i,\ 2r_i - M_{i+1},\ r_i\} \tag{3}$$
where $M_n = r_n$. The goal of HDC design is $M_2 \le K$, i.e. the maximum distance between two non-zero elements of the second layer is less than or equal to the kernel size of that layer. HDC also requires that the dilation rates within a group not share a common factor, i.e. the selected dilation rates have no common divisor greater than 1. When the dilation rates are chosen properly, the resulting convolution receptive field is as shown in the lower part of fig. 3.
As shown in fig. 4, the atrous spatial pyramid pooling module provided by the invention has 6 branches in total. The first branch passes the original feature map through and supplements the spatial information in the feature map. The last branch is an average pooling layer that captures global, image-level features. The second to fifth branches are dilated convolution layers with different dilation rates, which enlarge the receptive field of the feature map and extract multi-scale road feature information comprehensively. The dilation rates of these 4 branches are chosen according to the HDC criterion and are 2, 3, 7 and 13, respectively. This set of dilation rates has no common divisor greater than 1, and, with $M_4 = r_4 = 13$ and $M_3 = \max\{-1, 1, 7\} = 7$:
$$M_2 = \max\{M_3 - 2r_2,\ 2r_2 - M_3,\ r_2\} = 3 \le 3 \tag{4}$$
This set of dilation rates therefore satisfies the HDC criterion, which shows that the receptive field of the atrous convolution pyramid formed by the 4 dilation rates can cover the whole area of the underlying feature map. This effectively alleviates the gridding problem and strengthens the ability of the HRNet network to extract multi-scale road information.
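The HDC check above can be reproduced with a few lines of Python. The sketch below applies the recurrence of equation (3) from the last layer backwards and tests the two conditions named in the text ($M_2 \le K$ and no common divisor greater than 1); the function names are hypothetical and chosen only for this example.

```python
from functools import reduce
from math import gcd


def hdc_max_gaps(rates: list[int]) -> list[int]:
    """Return [M_1, ..., M_n] with M_n = r_n and
    M_i = max(M_{i+1} - 2*r_i, 2*r_i - M_{i+1}, r_i)."""
    gaps = [0] * len(rates)
    gaps[-1] = rates[-1]
    for i in range(len(rates) - 2, -1, -1):
        nxt = gaps[i + 1]
        gaps[i] = max(nxt - 2 * rates[i], 2 * rates[i] - nxt, rates[i])
    return gaps


def satisfies_hdc(rates: list[int], kernel_size: int = 3) -> bool:
    """Check M_2 <= K and that the rates share no common divisor above 1."""
    if len(rates) < 2:
        raise ValueError("need at least two dilation rates")
    gaps = hdc_max_gaps(rates)
    no_common_factor = reduce(gcd, rates) == 1
    return gaps[1] <= kernel_size and no_common_factor


print(hdc_max_gaps([2, 3, 7, 13]))   # M_1..M_4 for the rates used in the text
print(satisfies_hdc([2, 3, 7, 13]))  # rates from the text
print(satisfies_hdc([2, 4, 8]))      # a set with common factor 2, for contrast
```

For the rates 2, 3, 7 and 13 the second entry of the returned list is 3, matching equation (4), while a set such as 2, 4, 8 fails both conditions.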
As shown in fig. 5, the invention introduces a channel attention mechanism to improve the quality of network feature extraction. Different channels of a feature map differ in importance, but the original HRNet network does not take this into account: after up-sampling or down-sampling, the output feature maps are fused directly by concatenation, which hinders the recovery of pixel-level features. The channel attention mechanism accounts for the difference in importance between channels and sets the proportion of each channel through a learned set of weights. This amounts to recalibrating the original features with the weights: important features are enhanced, unimportant features are attenuated, and the quality of feature extraction improves.
The process consists of three steps: Squeeze, Excitation and Scale. Squeeze ($F_{sq}$) uses global average pooling to compress the two-dimensional H × W feature of each channel into a single number, converting the feature map from [H, W, C] to [1, 1, C]. Excitation ($F_{ex}$) generates a weight for each feature channel, using two fully connected layers to model the correlation between channels; the number of output weights equals the number of channels of the input feature map. Scale ($F_{scale}$) applies the normalized weights obtained in the previous step to the corresponding channels of the original feature map, multiplying each channel by its weight coefficient to obtain the output feature map.
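The three steps can be illustrated with a small NumPy forward pass. This is a sketch under stated assumptions rather than the network used in the invention: the ReLU between the two fully connected layers, the sigmoid that normalizes the weights, the omission of biases, and the reduced hidden width are the usual choices for squeeze-and-excitation blocks and are not specified in the text; `se_block`, `w1` and `w2` are hypothetical names.

```python
import numpy as np


def se_block(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Channel attention on a feature map x of shape (H, W, C).

    w1 has shape (C, C_hidden) and w2 has shape (C_hidden, C); both are
    stand-ins for learned fully connected weights (biases omitted).
    """
    # Squeeze: global average pooling, (H, W, C) -> (C,)
    z = x.mean(axis=(0, 1))
    # Excitation: FC -> ReLU -> FC -> sigmoid, one weight in (0, 1) per channel
    hidden = np.maximum(z @ w1, 0.0)
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w2)))
    # Scale: multiply every channel by its weight (broadcast over H and W)
    return x * weights


rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5, 8))   # H=4, W=5, C=8
w1 = rng.normal(size=(8, 2))            # hidden width 2 is an arbitrary choice
w2 = rng.normal(size=(2, 8))
recalibrated = se_block(features, w1, w2)
print(recalibrated.shape)
```

The output has the same shape as the input; only the relative magnitude of the channels changes.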
As shown in fig. 6, the invention improves the residual module of the network by introducing depthwise separable convolution, in order to reduce the large number of parameters produced by the heavy use of residual modules in the network. One 3×3 standard convolution in the residual block is replaced with a 3×3 depthwise separable convolution, which consists of a 3×3 depthwise (channel-by-channel) convolution layer followed by a 1×1 pointwise convolution layer. Each kernel of the depthwise convolution processes a single channel only, so one convolution pass outputs a feature map with the same number of channels as the input feature map. Because the depthwise convolution operates on each channel independently, it does not take into account the relationship between features at the same spatial position in different channels. A 1×1 pointwise convolution is therefore added after it; on the one hand it combines the depthwise outputs with weights along the depth direction, and on the other hand it can change the number of channels of the output to form a new feature map. Replacing the original 3×3 standard convolution with the 3×3 depthwise separable convolution effectively reduces the number of parameters of the HRNet network, lowers the computational complexity of the model, and speeds up network training.
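The parameter saving can be made concrete by counting weights. The sketch below compares a standard K × K convolution with its depthwise separable counterpart; biases are ignored, and the channel counts in the example are illustrative values that do not come from the text.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights of a k x k depthwise convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in      # one k x k kernel per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


c_in, c_out = 64, 64  # illustrative channel counts
print(standard_conv_params(c_in, c_out))        # 36864
print(depthwise_separable_params(c_in, c_out))  # 4672
```

With 64 input and 64 output channels the separable form needs roughly one eighth of the weights of the standard 3×3 convolution.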
3. Model testing and verification
The test set is input into the trained network model, which extracts the road information contained in the high-resolution remote sensing images of the test set and produces a binary road map. The prediction results are output and saved, compared with the sample labels, and the model evaluation indexes are calculated.
Model evaluation indexes include precision, accuracy, recall, F1 score, and mean intersection over union (mIoU).
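As an illustration of how these indexes can be computed from a predicted binary road map and its label, the sketch below uses the standard confusion-matrix definitions. It is an assumption that these are the definitions used in the evaluation; in particular, mIoU is taken here as the mean of the road-class IoU and the background-class IoU, and `road_metrics` is a hypothetical helper name.

```python
import numpy as np


def _safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def road_metrics(pred: np.ndarray, label: np.ndarray) -> dict[str, float]:
    """Compute evaluation indexes from binary masks (1 = road, 0 = background)."""
    pred = pred.astype(bool)
    label = label.astype(bool)
    tp = int(np.sum(pred & label))     # road predicted as road
    fp = int(np.sum(pred & ~label))    # background predicted as road
    fn = int(np.sum(~pred & label))    # road predicted as background
    tn = int(np.sum(~pred & ~label))   # background predicted as background

    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    iou_road = _safe_div(tp, tp + fp + fn)
    iou_background = _safe_div(tn, tn + fp + fn)
    return {
        "precision": precision,
        "accuracy": _safe_div(tp + tn, tp + fp + fn + tn),
        "recall": recall,
        "f1": _safe_div(2 * precision * recall, precision + recall),
        "mIoU": (iou_road + iou_background) / 2,
    }
```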
The model evaluation results were as follows:
In an actual application scenario, the acquired high-resolution remote sensing image data are processed according to the data preprocessing and feature extraction steps, and there is no need to divide them into a verification set and a test set. The road extraction model is then loaded, the processed data are input into it, and the road extraction result is obtained after model processing. In specific implementations, those skilled in the art may use computer software technology to run the method of the technical solution of the invention as an automatic workflow; system apparatus for running the method, such as a computer-readable storage medium storing the corresponding computer program and computer equipment that includes and runs the corresponding computer program, shall also fall within the protection scope of the invention.
Claims (6)
1. An HRNet-based improved remote sensing image road extraction method, characterized by comprising the following steps:
step 1, preprocessing the high-resolution remote sensing image data in the CHN6-CUG data set to obtain a preprocessed remote sensing image data set, wherein the remote sensing image data set comprises a training set, a verification set and a test set;
step 2, constructing a road prediction model based on the improved HRNet;
step 3, setting network training parameters, wherein the training parameters comprise the number of training epochs, the batch size and the learning rate;
step 4, training the improved HRNet network constructed in step 2 with the training set and the verification set of the remote sensing image data set obtained in step 1, according to the network training parameters set in step 3; training the network model comprises loading the training set and the verification set of the CHN6-CUG data set and loading the network model, wherein the network training uses the average of DiceLoss and CrossEntropyLoss as the loss function;
step 5, road prediction: inputting the test set obtained in step 1 into the model trained in step 4 for prediction, and outputting the predicted binary road map.
2. The HRNet-based improved remote sensing image road extraction method according to claim 1, characterized in that step 1 comprises the following sub-steps:
step 101: augmenting the high-resolution remote sensing image data in the CHN6-CUG data set;
step 102: dividing the CHN6-CUG data set augmented in step 101 into a training set, a verification set and a test set according to a certain proportion;
step 103: standardizing and normalizing the training set, the verification set and the test set obtained in step 102, so that the pixel values of the high-resolution remote sensing image data are adjusted to a similar range and the data are mean-removed and centered.
3. The HRNet-based improved remote sensing image road extraction method according to claim 1, characterized in that step 2 comprises the following sub-steps:
step 201: using an atrous spatial pyramid pooling module to process the two feature maps output by the first stage of the HRNet network, enlarging the receptive field of the feature maps and increasing the multi-scale road feature information they contain;
step 202: using a channel attention mechanism to calculate weights for the different channels of the feature map output by the atrous spatial pyramid pooling module, and combining the weights with the feature map to enhance the representation of important feature information in the feature map;
step 203: improving the residual structure in the HRNet network with depthwise separable convolution, replacing one 3×3 standard convolution layer in the residual structure with a 3×3 depthwise separable convolution.
4. An HRNet-based improved remote sensing image road extraction system, characterized by comprising:
an atrous spatial pyramid pooling module, configured to process the two feature maps output by the first stage of the HRNet network, enlarging the receptive field of the feature maps and increasing the multi-scale road feature information they contain;
a channel attention mechanism module, configured to calculate weights for the different channels of the feature map output by the atrous spatial pyramid pooling module, and to combine the weights with the feature map to enhance the representation of important feature information in the feature map;
a lightweight module, configured to improve the residual structure in the HRNet network with depthwise separable convolution, replacing one 3×3 standard convolution layer in the residual structure with a 3×3 depthwise separable convolution.
5. An HRNet-based improved remote sensing image road extraction device, characterized by comprising:
a memory, which is a computer-readable device storing a computer program for the HRNet-based improved remote sensing image road extraction method according to any one of claims 1-3;
a processor, configured to implement the HRNet-based improved remote sensing image road extraction method according to any one of claims 1-3 when executing said computer program.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the HRNet-based improved remote sensing image road extraction method according to any one of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310959573.6A CN116935226A (en) | 2023-08-01 | 2023-08-01 | HRNet-based improved remote sensing image road extraction method, system, equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116935226A (en) | 2023-10-24 |
Family
ID=88389673
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310959573.6A Pending CN116935226A (en) | 2023-08-01 | 2023-08-01 | HRNet-based improved remote sensing image road extraction method, system, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116935226A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113436210A (en) * | 2021-06-24 | 2021-09-24 | 河海大学 | Road image segmentation method fusing context progressive sampling |
CN114882380A (en) * | 2022-07-08 | 2022-08-09 | 山东省国土测绘院 | Wetland resource remote sensing identification algorithm based on improved hrnet model |
CN114943902A (en) * | 2022-03-30 | 2022-08-26 | 安徽大学 | Urban vegetation unmanned aerial vehicle remote sensing classification method based on multi-scale feature perception network |
CN115439751A (en) * | 2022-09-22 | 2022-12-06 | 桂林理工大学 | Multi-attention-fused high-resolution remote sensing image road extraction method |
CN116310871A (en) * | 2023-03-20 | 2023-06-23 | 中国人民解放军战略支援部队信息工程大学 | Inland water extraction method integrating cavity space pyramid pooling |
CN116434278A (en) * | 2023-04-17 | 2023-07-14 | 广东工业大学 | Training method of human body posture estimation model |
Non-Patent Citations (2)
Title |
---|
史健锋等: "结合ASPP与改进HRNet的多尺度图像语义分割方法研究", 液晶与显示, vol. 36, no. 11, 13 November 2021 (2021-11-13) * |
陈雪梅等: "基于HRNet的高分辨率遥感影像道路提取方法", 系统工程与电子技术, 22 May 2023 (2023-05-22) * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117423021A (en) * | 2023-12-19 | 2024-01-19 | 广东海洋大学 | Method for identifying damaged mangrove images of unmanned aerial vehicle |
CN117423021B (en) * | 2023-12-19 | 2024-02-23 | 广东海洋大学 | Method for identifying damaged mangrove images of unmanned aerial vehicle |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||