CN112749578A - Remote sensing image automatic road extraction method based on deep convolutional neural network - Google Patents

Remote sensing image automatic road extraction method based on deep convolutional neural network

Info

Publication number
CN112749578A
Authority
CN
China
Prior art keywords
decoder
input
remote sensing
sensing image
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911039232.7A
Other languages
Chinese (zh)
Inventor
钱启
周健
张一明
闫灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Star Map Co ltd
Original Assignee
Zhongke Star Map Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Star Map Co ltd filed Critical Zhongke Star Map Co ltd
Priority to CN201911039232.7A priority Critical patent/CN112749578A/en
Publication of CN112749578A publication Critical patent/CN112749578A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/10: Terrestrial scenes
    • G06V 20/182: Network patterns, e.g. roads or rivers
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/2148: Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/443: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V 10/449: Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V 10/451: Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V 10/454: Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides a method for automatic road extraction from remote sensing images based on a deep convolutional neural network, which comprises: collecting remote sensing image data and constructing a training data set; constructing a deep semantic segmentation network model in encoder-decoder mode as the deep convolutional neural network model; training the deep semantic segmentation network model with the training data set; and inputting test data into the trained deep semantic segmentation network model to obtain the road extraction result of the remote sensing image. In this way, remote sensing images can be segmented in complex scenes and the road information in them accurately extracted.

Description

Remote sensing image automatic road extraction method based on deep convolutional neural network
Technical Field
Embodiments of the present invention relate generally to the fields of deep learning, computer vision and remote sensing satellite image processing, and more particularly to a method for automatic road extraction from remote sensing images based on a deep convolutional neural network.
Background
Road information plays a very important role in modern society, so research on road extraction methods based on remote sensing images is of great significance. Remote sensing image road extraction has many application scenarios, such as automatic driving and vehicle navigation, map generation, city planning, intelligent transportation, and land use monitoring. The accuracy of road extraction not only significantly influences the recognition of other ground objects such as vehicles and buildings, but is also one of the key technologies in research fields such as natural disaster early warning, military strikes, and unmanned vehicle path planning.
Remote sensing image road extraction can be divided into semi-automatic and automatic extraction according to the degree of automation of the algorithm. A semi-automatic extraction algorithm requires the starting point of the road or designated seed points to be set manually, while an automatic extraction algorithm extracts the road entirely by computer, without adding subjective human judgment.
With the development of artificial intelligence, fully using a computer's capacity to learn from and process images when interpreting remote sensing images can save a great deal of manpower and time and greatly improve working efficiency. Therefore, methods that automatically extract road information from remote sensing images are becoming the mainstream research direction. Existing road extraction methods fall into three types: pixel-based, object-oriented, and deep learning-based. Deep learning-based road extraction methods perform best.
Road segmentation from satellite images is a very challenging task; compared with general segmentation tasks, it is unique and difficult in the following ways. In satellite images, the target roads generally occupy a small proportion of the image, and rivers, railways and the like are so similar to roads that even human eyes find them hard to distinguish. Road bifurcations and junctions are complex, which places high demands on the identification precision of road extraction. Roads in satellite images are often narrow, have prior connectivity, may cross-connect with one another, and can span the whole picture. In addition, the connectivity problem in road identification is prominent, with many extracted roads broken. These challenges make road extraction from satellite images difficult, and traditional image segmentation methods struggle to meet them.
Disclosure of Invention
According to the embodiment of the invention, a remote sensing image automatic road extraction scheme based on a deep convolutional neural network is provided.
In a first aspect of the invention, a remote sensing image automatic road extraction method based on a deep convolutional neural network is provided. The method comprises the following steps:
collecting remote sensing image data and constructing a training data set;
constructing a deep semantic segmentation network model of an encoder-decoder mode as a deep convolutional neural network model;
training the deep semantic segmentation network model by using the training data set;
and inputting the test data into the trained deep semantic segmentation network model to obtain a road extraction result of the remote sensing image.
Further, before the training data set is constructed, data enhancement is performed on the acquired remote sensing image data, and the data-enhanced remote sensing image data are used as the training data set.
Further, the construction process of the deep semantic segmentation network model of the encoder-decoder mode comprises the following steps:
sequentially constructing an encoder and a decoder to complete the construction of a deep semantic segmentation network model of an encoder-decoder mode; wherein
The process of constructing the encoder includes:
inputting an input image into a basic classification network module pre-trained on ImageNet, feeding the resulting output into a hole convolution module group, cascading the feature extraction results output by the hole convolution module group, performing feature fusion by 1 × 1 convolution, and outputting a feature map to complete the encoder construction; wherein the output feature map of the encoder serves as the first input of the decoder, the first bottom-layer feature map output by the basic classification network module serves as the second input of the decoder, and the second bottom-layer feature map output by the basic classification network module serves as the third input of the decoder;
the process of constructing a decoder includes: the first decoder and the second decoder are constructed in sequence.
Further, the first decoder construction process comprises:
performing a first up-sampling on the first input of the decoder, then cascading it with the second input of the decoder, then performing two consecutive convolutions to obtain a first feature map, and performing a second up-sampling on the first feature map to obtain a first decoding feature map;
the second decoder construction process comprises:
and cascading a third input of the decoder with the first decoding feature map, then performing continuous convolution twice to obtain a second decoding feature map, performing second up-sampling on the second decoding feature map to obtain a result map of the deep semantic segmentation network model, and completing construction of the deep semantic segmentation network model.
Further, in the first decoder construction process, before the cascading, channel dimensionality reduction is performed with 1 × 1 convolutions on the first up-sampled first input of the decoder and on the second input of the decoder;
in the second decoder construction process, before the cascading, channel dimensionality reduction is performed with 1 × 1 convolutions on the third input of the decoder and on the first decoding feature map.
Further, the hole convolution module group comprises a plurality of branches, each branch comprising a hole convolution filter with the same convolution kernel size but a different hole rate.
Further, before the test data are input into the trained deep semantic segmentation network model, the input picture of the test data is cut into overlapping tiles, data enhancement is applied to the cut tiles, the enhanced test data are input into the trained deep semantic segmentation network model for road feature extraction and segmentation to obtain segmentation results for each tile, and the tiles are then fused by averaging to obtain the road extraction result of the remote sensing image.
Further, the road extraction result of the remote sensing image is post-processed, that is, optimized by the closing operation of mathematical morphology (dilation followed by erosion), to obtain the optimized road extraction result of the remote sensing image.
In a second aspect of the invention, an electronic device is provided. The electronic device includes a memory on which a computer program is stored, and a processor which implements the method described above when executing the program.
In a third aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the method according to the first aspect of the invention.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the invention, nor to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
The method has a good segmentation effect on remote sensing images in complex scenes and, while extracting roads accurately, greatly improves working efficiency.
Drawings
The above and other features, advantages and aspects of various embodiments of the present invention will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference characters designate like or similar elements, and wherein:
FIG. 1 shows a flow chart of a method for automatic road extraction of remote sensing images based on deep convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a model structure based on a deep convolutional neural network according to an embodiment of the present invention;
FIG. 3 illustrates a block diagram of the hole convolution module group in accordance with an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a morphological processing method in the field of digital image processing according to an embodiment of the present invention;
FIG. 5 is a graph showing a comparison of experimental segmentation results before and after using a morphological processing method according to an embodiment of the present invention;
FIG. 6 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
In addition, the term "and/or" herein merely describes an association between related objects and indicates that three relationships are possible; for example, A and/or B may mean: A alone, A and B together, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
In the invention, based on the deep convolutional neural network model, remote sensing images are segmented in complex scenes and the road information in them is accurately extracted.
Fig. 1 shows a flowchart of a method for automatically extracting a road from a remote sensing image based on a deep convolutional neural network according to an embodiment of the present invention.
As shown, the method includes:
s101, collecting remote sensing image data and constructing a training data set.
The acquired remote sensing image data contain roads, and the road scenes include urban roads and rural roads.
In an embodiment of the invention, after the remote sensing image data are collected, data enhancement is applied to them, and the data-enhanced remote sensing image data serve as the final training data set.
In this embodiment, during data enhancement, the data sample set is randomly subjected to operations such as 90°, 180° and 270° rotation, horizontal flipping, vertical flipping, gamma transformation, blurring, and Gaussian noise. To amplify the data set further, a specified number of random crops are taken. In addition, to eliminate color shift between different images, the images are color-transformed. Data enhancement mainly enriches the variety of the training data set and improves the training effect.
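As a minimal illustrative sketch (not the patent's own code), the enhancement operations above can be approximated in Python with NumPy as follows; the application probabilities, gamma range, and noise level are assumptions chosen for illustration:
```python
import numpy as np

def augment(image: np.ndarray, mask: np.ndarray, rng=np.random):
    """Jointly augment an image tile and its road mask."""
    k = rng.randint(4)                       # rotate by 0/90/180/270 degrees
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    if rng.rand() < 0.5:                     # horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    if rng.rand() < 0.5:                     # vertical flip
        image, mask = np.flipud(image), np.flipud(mask)
    if rng.rand() < 0.5:                     # gamma transform (image only)
        gamma = rng.uniform(0.7, 1.5)
        image = np.clip((image / 255.0) ** gamma * 255.0, 0, 255)
    if rng.rand() < 0.5:                     # additive Gaussian noise
        image = np.clip(image + rng.normal(0, 8, image.shape), 0, 255)
    return image.astype(np.uint8), mask
```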
S102, constructing a deep semantic segmentation network model of an encoder-decoder mode as a deep convolutional neural network model.
Fig. 2 is a schematic diagram illustrating a model structure based on a deep convolutional neural network according to an embodiment of the present invention.
The construction process of the deep semantic segmentation network model of the coder-decoder mode comprises the following steps:
sequentially constructing an encoder and a decoder to complete the construction of a deep semantic segmentation network model of an encoder-decoder mode; wherein
The process of constructing the encoder includes:
inputting an input image into a basic classification network module pre-trained on ImageNet, feeding the resulting output into a hole convolution module group, cascading the feature extraction results output by the hole convolution module group, performing feature fusion by 1 × 1 convolution, and outputting a feature map to complete the encoder construction; wherein the output feature map of the encoder serves as the first input of the decoder, the first bottom-layer feature map output by the basic classification network module, for example at 1/4 size, serves as the second input of the decoder, and the second bottom-layer feature map, for example at 1/2 size, serves as the third input of the decoder.
In one embodiment of the present invention, when constructing the encoder, a basic classification network module pre-trained on ImageNet is first selected, and a hole convolution module group is then added on top of it.
As an embodiment, the basic classification network module uses a ResNet101 network as its network structure.
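For illustration only, the pretrained backbone could be tapped for the three encoder outputs roughly as below using torchvision; the layer choices (the stem ReLU output at 1/2 size, layer1 at 1/4 size, layer4 as block4) are plausible assumptions rather than the patent's stated wiring, and in practice the last residual stages would use dilated convolutions so the encoder output stride stays at 16, matching the decoder's 4× up-sampling:
```python
import torch.nn as nn
from torchvision.models import resnet101

class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        net = resnet101(weights="IMAGENET1K_V1")  # ImageNet-pretrained weights
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu)  # 1/2, 64 ch
        self.pool = net.maxpool                                  # down to 1/4
        self.layer1 = net.layer1                                 # 1/4, 256 ch
        self.rest = nn.Sequential(net.layer2, net.layer3, net.layer4)

    def forward(self, x):
        low2 = self.stem(x)                  # second bottom-layer feature map
        low4 = self.layer1(self.pool(low2))  # first bottom-layer feature map
        out = self.rest(low4)                # block4 features for the hole convs
        return out, low4, low2
```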
FIG. 3 shows a block diagram of the hole convolution module group according to an embodiment of the invention.
The hole convolution module group comprises a plurality of branches, and each branch comprises a hole convolution filter with the same convolution kernel size and different hole rates.
In this embodiment, the hole convolution module group mainly consists of 4 branches, each branch containing a hole convolution filter with a 3 × 3 convolution kernel and a certain hole rate.
As an example, the hole rates of these 4 branches are 6, 12, 18 and 24, respectively. The feature extraction results of the 4 branches are then cascaded, and feature fusion is performed through a 1 × 1 convolution. Since the underlying network used by the invention is ResNet101, in FIG. 3 the input to the hole convolution module group is the feature map of block4 in ResNet101.
The key concept in the hole convolution module group is the hole convolution itself. A hole convolution adjusts the receptive field of the filter by changing the value of its hole rate, thereby capturing multi-scale features. For each location i on the output feature map y and a convolution filter w, the hole convolution acts on the input feature map x according to equation (1), where the hole rate r is the stride with which the input signal is sampled and k runs over the positions of the convolution filter w. Notably, when the hole rate r is 1, the hole convolution degenerates to an ordinary convolution.
y[i] = ∑_k x[i + r·k] · w[k]    (1)
The hole convolution module group is added mainly so that, combined with the basic classification network module, it enlarges the receptive field of the network filters, enriches image context information, and produces multi-scale dense feature maps, which makes it convenient to segment targets at multiple scales and better completes the semantic segmentation task.
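A minimal PyTorch sketch of such a hole convolution module group follows; the four 3 × 3 branches with hole rates 6/12/18/24, the cascade, and the 1 × 1 fusion follow the description above, while the channel counts are assumptions (2048 input channels matching block4 of ResNet101):
```python
import torch
import torch.nn as nn

class HoleConvGroup(nn.Module):
    def __init__(self, in_ch=2048, branch_ch=256, rates=(6, 12, 18, 24)):
        super().__init__()
        # one 3x3 hole convolution per branch; dilation is the hole rate r
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=r, dilation=r)
            for r in rates
        ])
        # 1x1 convolution fusing the cascaded branch outputs
        self.fuse = nn.Conv2d(branch_ch * len(rates), branch_ch, kernel_size=1)

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]  # same spatial size
        return self.fuse(torch.cat(feats, dim=1))        # cascade, then fuse
```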
The process of constructing the decoder includes constructing the first decoder and the second decoder in sequence. Combining the first and second decoding modules substantially improves segmentation detail and the positioning accuracy of target boundaries.
The first decoder construction process includes:
the first input of the decoder is cascaded with the second input of the decoder after first up-sampling, then two times of continuous convolution are carried out to obtain a first characteristic diagram, and second up-sampling is carried out on the first characteristic diagram to obtain a first decoding characteristic diagram. By the first decoder, the segmentation characteristic map generated by the encoder can be effectively improved, and the refinement of the boundary information is facilitated.
In an embodiment of the present invention, the output feature map of the encoder is up-sampled 4× bilinearly and then concatenated with the 1/4-size bottom-layer feature map generated by the basic network in the encoder. Both feature maps then have the same spatial resolution.
In the first decoder construction process, before the cascading, channel dimensionality reduction with 1 × 1 convolutions is applied to the first up-sampled first input of the decoder and to the second input of the decoder, so that the bottom-layer features are reduced in dimension.
As an embodiment of the present invention, before cascading, a 1 × 1 convolution is first used for channel dimensionality reduction, so that the bottom-layer features are reduced to a unified dimensionality; the cascading is then performed. Two 3 × 3 convolutions are applied consecutively after the cascade; this effectively refines the cascaded feature map and yields a better segmentation. The resulting feature map is up-sampled 2× bilinearly to obtain the final output feature map of the first decoder.
The second decoder construction process includes:
and cascading a third input of the decoder with the first decoding feature map, then performing continuous convolution twice to obtain a second decoding feature map, performing second up-sampling on the second decoding feature map to obtain a result map of the deep semantic segmentation network model, and completing construction of the deep semantic segmentation network model.
In the second decoder construction process, before the cascading, channel dimensionality reduction with 1 × 1 convolutions is applied to the third input of the decoder and to the first decoding feature map, so that the bottom-layer features are reduced in dimension.
In this embodiment of the invention, a 1 × 1 convolution is first used for channel dimensionality reduction so that the bottom-layer features are reduced to a unified dimensionality; the 1/2-size bottom-layer feature map generated by the basic network in the encoder is then cascaded with the feature map generated by the first decoding module, two consecutive 3 × 3 convolutions generate the feature map, and a 2× bilinear up-sampling produces the final result map of the semantic segmentation network model. Constructing the second decoding module further recovers the boundary details of the target, improving segmentation detail and the positioning accuracy of the target boundary.
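Under the same assumptions (256 channels for the encoder output and the 1/4-size features, 64 channels for the 1/2-size features, encoder output stride 16), the two decoding stages can be sketched as:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def up(x, scale):
    return F.interpolate(x, scale_factor=scale, mode="bilinear",
                         align_corners=False)

class TwoStageDecoder(nn.Module):
    def __init__(self, enc_ch=256, low4_ch=256, low2_ch=64, mid_ch=256,
                 num_classes=2):
        super().__init__()
        self.reduce_enc = nn.Conv2d(enc_ch, mid_ch, 1)  # 1x1 channel reduction
        self.reduce4 = nn.Conv2d(low4_ch, 48, 1)
        self.reduce2 = nn.Conv2d(low2_ch, 48, 1)
        self.refine1 = nn.Sequential(                   # two 3x3 convolutions
            nn.Conv2d(mid_ch + 48, mid_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1), nn.ReLU())
        self.refine2 = nn.Sequential(
            nn.Conv2d(mid_ch + 48, mid_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(mid_ch, num_classes, 3, padding=1))

    def forward(self, enc_out, low4, low2):
        # first decoder: 4x up-sample, cascade with 1/4 features, refine, 2x up
        x = torch.cat([up(self.reduce_enc(enc_out), 4), self.reduce4(low4)], 1)
        x = up(self.refine1(x), 2)
        # second decoder: cascade with 1/2 features, refine, 2x up to full size
        x = torch.cat([x, self.reduce2(low2)], 1)
        return up(self.refine2(x), 2)
```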
S103, training the deep semantic segmentation network model by using the training data set.
In an embodiment of the invention, the basic network is adjusted: the 1000 classes of its last layer are changed to 2 classes, namely road and background, so that the basic network better suits the binary semantic segmentation task of automatic road extraction.
In one embodiment of the invention, regarding the definition of the loss function, cross-entropy loss is used as the optimization target of the deep convolutional neural network, which effectively formulates the semantic segmentation problem addressed by the invention.
In one embodiment of the invention, regarding the choice of optimizer, a standard SGD optimizer is used to solve the loss function. The optimizer traverses the data samples in batches during optimization, which helps the deep convolutional neural network converge quickly.
In one embodiment of the invention, a poly learning rate strategy is used for the learning rate setting. Compared with reducing the learning rate after a fixed number of iteration steps, this strategy is more effective and lowers the final loss value of the deep convolutional neural network.
The poly learning rate strategy first sets an initial learning rate, which is then dynamically updated during training according to equation (2):
lr = lr_init × (1 - iter / max_iter)^power    (2)
where power is 0.9, iter denotes the current iteration number, max_iter denotes the set total number of iterations, lr_init is the initial learning rate, and lr is the updated learning rate.
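The training configuration above (2-class head, cross-entropy loss, SGD, poly schedule of equation (2)) can be sketched as follows; the stand-in model, dummy batches, and hyper-parameter values are illustrative assumptions:
```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 2, kernel_size=1)       # stand-in for the 2-class network
criterion = nn.CrossEntropyLoss()            # road vs. background
base_lr, power, max_iter = 0.01, 0.9, 1000   # assumed hyper-parameters

optimizer = torch.optim.SGD(model.parameters(), lr=base_lr,
                            momentum=0.9, weight_decay=1e-4)

def poly_lr(iteration):
    # equation (2): lr = lr_init * (1 - iter / max_iter) ** power
    return base_lr * (1 - iteration / max_iter) ** power

for it in range(max_iter):
    images = torch.randn(2, 3, 64, 64)           # dummy image batch
    labels = torch.randint(0, 2, (2, 64, 64))    # dummy road/background masks
    for group in optimizer.param_groups:         # dynamic poly LR update
        group["lr"] = poly_lr(it)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```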
In an embodiment of the invention, mIoU is adopted as the evaluation index of algorithm performance during training. It is the most common metric in deep learning semantic segmentation and an important index of image segmentation precision; it measures the overall segmentation effect of the algorithm, and the larger the value, the better the segmentation.
For each picture, the intersection-over-union (IoU) score over pixels is defined in equation (3):
IoU_i = TP_i / (TP_i + FP_i + FN_i)    (3)
where, for picture i, TP_i denotes the number of pixels correctly predicted as road pixels, FP_i the number of pixels incorrectly predicted as road pixels, and FN_i the number of pixels incorrectly predicted as non-road pixels.
Assuming there are n pictures, the mean intersection-over-union score mIoU over all pictures is the final evaluation index, as given in equation (4):
mIoU = (1/n) ∑_{i=1}^{n} IoU_i    (4)
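Equations (3) and (4) translate directly into code; the sketch below assumes binary masks with 1 for road and 0 for background:
```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    tp = np.logical_and(pred == 1, gt == 1).sum()  # correctly predicted road
    fp = np.logical_and(pred == 1, gt == 0).sum()  # wrongly predicted as road
    fn = np.logical_and(pred == 0, gt == 1).sum()  # road predicted as non-road
    return tp / (tp + fp + fn + 1e-10)             # equation (3)

def miou(preds, gts) -> float:
    return float(np.mean([iou(p, g) for p, g in zip(preds, gts)]))  # eq. (4)
```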
And S104, inputting the test data into the trained deep semantic segmentation network model to obtain a road extraction result of the remote sensing image.
Considering that remote sensing images are generally large, before the test data are input into the trained deep semantic segmentation network model, the input picture of the test data is cut into overlapping tiles, data enhancement is applied to the cut tiles, the enhanced test data are fed to the trained deep semantic segmentation network model for road feature extraction and segmentation to obtain segmentation results for each tile, and the tiles are then fused by averaging to obtain the road extraction result of the remote sensing image.
During testing, cutting the image and directly feeding the cut tile data into the trained model for road extraction gives a better prediction effect and higher road extraction accuracy.
As an embodiment of the invention, data enhancement in the testing stage specifically includes horizontal and vertical flipping of the image, rotation by certain angles, and the like, so that the predicted image variants are richer and the prediction effect improves.
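The overlapped tiling, flip-based test-time augmentation, and average fusion can be sketched as follows; the tile size, stride, threshold, and the predict_tile callback (which maps a tile to a per-pixel road probability map) are illustrative assumptions, and border handling is omitted for brevity:
```python
import numpy as np

def predict_large_image(image, predict_tile, tile=512, stride=384):
    h, w = image.shape[:2]
    prob = np.zeros((h, w), dtype=np.float32)
    count = np.zeros((h, w), dtype=np.float32)
    for y in range(0, max(h - tile, 0) + 1, stride):
        for x in range(0, max(w - tile, 0) + 1, stride):
            patch = image[y:y + tile, x:x + tile]
            preds = [predict_tile(patch),                        # original
                     np.fliplr(predict_tile(np.fliplr(patch))),  # h-flip TTA
                     np.flipud(predict_tile(np.flipud(patch)))]  # v-flip TTA
            prob[y:y + tile, x:x + tile] += np.mean(preds, axis=0)
            count[y:y + tile, x:x + tile] += 1
    return (prob / np.maximum(count, 1)) > 0.5   # averaged, fused road mask
```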
In a preferred embodiment of the invention, the road extraction result of the remote sensing image is post-processed.
Fig. 4 is a schematic diagram illustrating a morphological processing method in the field of digital image processing according to an embodiment of the present invention.
In the post-processing, optimization is performed by the closing operation of mathematical morphology, dilation followed by erosion, to obtain the optimized road extraction result of the remote sensing image.
The basic operations of mathematical morphology include erosion, dilation, opening, and closing.
As an embodiment, the invention uses the closing operation when post-processing the prediction results. The closing operation first dilates the image and then erodes it; a schematic diagram is shown in FIG. 4. Closing can fill small holes in objects, connect neighboring objects, and smooth their boundaries without significantly changing their area. Since the connectivity problem of road identification is prominent in road extraction, with many broken roads, and given these properties of the closing operation, the invention adopts closing-based morphological optimization to reduce broken roads as much as possible.
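A minimal sketch of this closing-based post-processing with OpenCV follows; the 5 × 5 structuring element is an assumed size:
```python
import cv2
import numpy as np

def close_roads(mask: np.ndarray, ksize: int = 5) -> np.ndarray:
    """Dilate then erode the binary road mask to reconnect broken segments."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (ksize, ksize))
    return cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_CLOSE, kernel)
```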
FIG. 5 shows a comparison of experimental segmentation results before and after using the morphological processing method according to an embodiment of the present invention: FIG. 5(a) shows the segmentation result before the morphological processing, and FIG. 5(b) the result after. It can be seen that the method effectively improves the connectivity of road identification.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that the acts and modules illustrated are not necessarily required to practice the invention.
The above is a description of an embodiment of the method, and the following is a further description of the solution of the present invention through an embodiment of the electronic device implementing the above method of the present invention.
FIG. 6 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the present invention.
As shown, an electronic device comprises a memory having stored thereon a computer program and a processor implementing the method of fig. 1 when executing the program.
The device 600 includes a Central Processing Unit (CPU) 601 that can perform various appropriate actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM) 602 or loaded from a storage unit 608 into a Random Access Memory (RAM) 603. The RAM 603 can also store various programs and data necessary for the operation of the device 600. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processing unit 601 executes the respective methods and processes described above, for example, methods S101 to S104. For example, in some embodiments, methods S101-S104 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the CPU 601, one or more steps of the methods S101-S104 described above may be performed. Alternatively, in other embodiments, the CPU 601 may be configured to perform the methods S101-S104 by any other suitable means (e.g., by means of firmware).
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a complex programmable logic device (CPLD), and the like.
Program code for implementing the methods of the present invention may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the invention. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (10)

1. A remote sensing image automatic road extraction method based on a deep convolutional neural network is characterized by comprising the following steps:
collecting remote sensing image data and constructing a training data set;
constructing a deep semantic segmentation network model of an encoder-decoder mode as a deep convolutional neural network model;
training the deep semantic segmentation network model by using the training data set;
and inputting the test data into the trained deep semantic segmentation network model to obtain a road extraction result of the remote sensing image.
2. The method according to claim 1, wherein before constructing the training data set, data enhancement is performed on the acquired remote sensing image data to obtain data-enhanced remote sensing image data as the training data set.
3. The method according to claim 1, wherein the construction process of the deep semantic segmentation network model of the encoder-decoder mode comprises:
sequentially constructing an encoder and a decoder to complete the construction of a deep semantic segmentation network model of an encoder-decoder mode; wherein
The process of constructing the encoder includes:
inputting an input image into a basic classification network module pre-trained on ImageNet, feeding the resulting output into a hole convolution module group, cascading the feature extraction results output by the hole convolution module group, performing feature fusion by 1 × 1 convolution, and outputting a feature map to complete the encoder construction; wherein the output feature map of the encoder serves as the first input of the decoder, the first bottom-layer feature map output by the basic classification network module serves as the second input of the decoder, and the second bottom-layer feature map output by the basic classification network module serves as the third input of the decoder;
the process of constructing a decoder includes: the first decoder and the second decoder are constructed in sequence.
4. The method of claim 3,
the first decoder construction process comprises:
performing a first up-sampling on the first input of the decoder, then cascading it with the second input of the decoder, then performing two consecutive convolutions to obtain a first feature map, and performing a second up-sampling on the first feature map to obtain a first decoding feature map;
the second decoder construction process comprises:
cascading the third input of the decoder with the first decoding feature map, then performing two consecutive convolutions to obtain a second decoding feature map, and performing a second up-sampling on the second decoding feature map to obtain the result map of the deep semantic segmentation network model, completing construction of the deep semantic segmentation network model.
5. The method of claim 4, wherein in the first decoder construction process, prior to the concatenation, 1 × 1 convolutions are used for channel dimensionality reduction on the first up-sampled first input of the decoder and on the second input of the decoder;
in the second decoder construction process, prior to the concatenation, 1 × 1 convolutions are used for channel dimensionality reduction on the third input of the decoder and on the first decoding feature map.
6. The method of claim 3, wherein the hole convolution module group comprises a plurality of branches, each branch comprising a hole convolution filter with the same convolution kernel size but a different hole rate.
7. The method according to claim 1, wherein before the test data are input into the trained deep semantic segmentation network model, the input picture of the test data is cut into overlapping tiles, data enhancement is applied to the cut tiles, the enhanced test data are input into the trained deep semantic segmentation network model for road feature extraction and segmentation to obtain segmentation results for each tile, and the tiles are then fused by averaging to obtain the road extraction result of the remote sensing image.
8. The method according to claim 1, characterized in that the road extraction result of the remote sensing image is post-processed, i.e. optimized by the closing operation of mathematical morphology (dilation followed by erosion), to obtain the optimized road extraction result of the remote sensing image.
9. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program, wherein the processor, when executing the program, implements the method of any of claims 1-8.
10. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1 to 8.
CN201911039232.7A 2019-10-29 2019-10-29 Remote sensing image automatic road extraction method based on deep convolutional neural network Pending CN112749578A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911039232.7A CN112749578A (en) 2019-10-29 2019-10-29 Remote sensing image automatic road extraction method based on deep convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911039232.7A CN112749578A (en) 2019-10-29 2019-10-29 Remote sensing image automatic road extraction method based on deep convolutional neural network

Publications (1)

Publication Number Publication Date
CN112749578A 2021-05-04

Family

ID=75640155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911039232.7A Pending CN112749578A (en) 2019-10-29 2019-10-29 Remote sensing image automatic road extraction method based on deep convolutional neural network

Country Status (1)

Country Link
CN (1) CN112749578A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139617A (en) * 2021-05-10 2021-07-20 郑州大学 Power transmission line autonomous positioning method and device and terminal equipment
CN113139617B (en) * 2021-05-10 2023-04-07 郑州大学 Power transmission line autonomous positioning method and device and terminal equipment
CN113326799A (en) * 2021-06-22 2021-08-31 长光卫星技术有限公司 Remote sensing image road extraction method based on EfficientNet network and direction learning
CN114283343A (en) * 2021-12-20 2022-04-05 北京百度网讯科技有限公司 Map updating method, training method and equipment based on remote sensing satellite image
CN114283343B (en) * 2021-12-20 2023-09-26 北京百度网讯科技有限公司 Map updating method, training method and device based on remote sensing satellite image
CN116503729A (en) * 2023-03-17 2023-07-28 中国自然资源航空物探遥感中心 Road extraction method and device applied to remote sensing digital image
CN117789042A (en) * 2024-02-28 2024-03-29 中国地质大学(武汉) Road information interpretation method, system and storage medium
CN117789042B (en) * 2024-02-28 2024-05-14 中国地质大学(武汉) Road information interpretation method, system and storage medium

Similar Documents

Publication Publication Date Title
CN112749578A (en) Remote sensing image automatic road extraction method based on deep convolutional neural network
CN111274865A (en) Remote sensing image cloud detection method and device based on full convolution neural network
CN109003297B (en) Monocular depth estimation method, device, terminal and storage medium
CN108764039B (en) Neural network, building extraction method of remote sensing image, medium and computing equipment
CN107564009B (en) Outdoor scene multi-target segmentation method based on deep convolutional neural network
CN110675339A (en) Image restoration method and system based on edge restoration and content restoration
CN114898352A (en) Method for simultaneously realizing image defogging and license plate detection
CN114494821B (en) Remote sensing image cloud detection method based on feature multi-scale perception and self-adaptive aggregation
CN113569724B (en) Road extraction method and system based on attention mechanism and dilation convolution
CN114155200B (en) Remote sensing image change detection method based on convolutional neural network
CN115546768A (en) Pavement marking identification method and system based on multi-scale mechanism and attention mechanism
CN113658200A (en) Edge perception image semantic segmentation method based on self-adaptive feature fusion
CN115223063A (en) Unmanned aerial vehicle remote sensing wheat new variety lodging area extraction method and system based on deep learning
CN114418987B (en) Retina blood vessel segmentation method and system with multi-stage feature fusion
CN116912257A (en) Concrete pavement crack identification method based on deep learning and storage medium
CN116740362B (en) Attention-based lightweight asymmetric scene semantic segmentation method and system
CN113989287A (en) Urban road remote sensing image segmentation method and device, electronic equipment and storage medium
CN116883679B (en) Ground object target extraction method and device based on deep learning
CN115358962B (en) End-to-end visual odometer method and device
CN116778318A (en) Convolutional neural network remote sensing image road extraction model and method
CN114821651B (en) Pedestrian re-recognition method, system, equipment and computer readable storage medium
CN116310871A (en) Inland water extraction method integrating cavity space pyramid pooling
CN115830054A (en) Crack image segmentation method based on multi-window high-low frequency visual converter
CN115330703A (en) Remote sensing image cloud and cloud shadow detection method based on context information fusion
CN112446292A (en) 2D image salient target detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination