CN112699835A - Road extraction method, device and equipment based on reconstruction bias U-Net and storage medium - Google Patents


Info

Publication number
CN112699835A
Authority
CN
China
Prior art keywords
layer
convolution
reconstruction
inputting
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110038614.9A
Other languages
Chinese (zh)
Other versions
CN112699835B (en)
Inventor
陈子仪
杜吉祥
范文涛
Current Assignee
Huaqiao University
Original Assignee
Huaqiao University
Priority date
Filing date
Publication date
Application filed by Huaqiao University filed Critical Huaqiao University
Priority to CN202110038614.9A priority Critical patent/CN112699835B/en
Publication of CN112699835A publication Critical patent/CN112699835A/en
Application granted granted Critical
Publication of CN112699835B publication Critical patent/CN112699835B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V 20/182 — Image or video recognition or understanding; Scenes; Terrestrial scenes; Network patterns, e.g. roads or rivers
    • G06F 18/214 — Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 3/045 — Neural networks; Architecture; Combinations of networks
    • G06N 3/048 — Neural networks; Activation functions
    • G06N 3/08 — Neural networks; Learning methods

Abstract

The invention provides a road extraction method, device, equipment and storage medium based on reconstruction bias U-Net. The method comprises the following steps: acquiring a training set and a verification set; constructing a U-Net neural network model, wherein the model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers; inputting the training set into the U-Net neural network model for training, and storing the parameters of the network model to obtain an initial U-Net neural network model; inputting the verification set into the initial U-Net neural network model and training it in combination with a loss function until the model converges, to obtain a trained U-Net neural network model; and inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result. The reconstruction-biased U-Net neural network model effectively enhances the reconstruction capability of the network, giving the network better logical reasoning ability for occluded regions and similar ambiguous information, and thereby achieving better segmentation accuracy and robustness.

Description

Road extraction method, device and equipment based on reconstruction bias U-Net and storage medium
Technical Field
The invention relates to the field of road extraction, in particular to a road extraction method, device, equipment and storage medium based on reconstruction bias U-Net.
Background
With the increasing popularity of satellite maps built from remote sensing images, road extraction from remote sensing images has become an active research topic. Remote sensing image road extraction based on the deep residual U-Net starts from the original U-Net structure and adds residual connection modules, which strengthens training and helps the network converge. As shown in Table 1, the network structure comprises three parts: an encoding section, a bridging section and a decoding section. In the encoding section, a 224 × 224 × 3 image is used as the original input, followed by three levels of convolution modules. Each convolution module consists of two sets of convolution, ReLU and BN operations, and a residual skip connection is added in the second and third convolution modules. The bridging section contains two sets of BN, ReLU and convolution operations. The decoding section contains three levels of up-sampling operation combinations, each comprising up-sampling, concatenation, BN, ReLU, convolution and skip-connection operations. Finally, a convolution and a Sigmoid operation are performed to obtain the final road extraction result, which has a size of 224 × 224 × 1.
The current residual U-Net road extraction model is still a symmetric structure. Yet reconstruction (decoding) is harder work than encoding: in residual U-Net, as in most current U-Net-style segmentation models, the decoder has no more parameters than the encoder, and some models, such as PSPNet, use even fewer. Giving the decoder no more parameters than the encoder while demanding more complex work from it unbalances the encoding and decoding capabilities of the network, which ultimately degrades the performance of the network model.
TABLE 1 network architecture parameters for residual U-Net
(Table 1 appears as an image in the original publication; its contents are not reproduced in the text.)
Disclosure of Invention
The invention aims to provide a road extraction method, a road extraction device, road extraction equipment and a road extraction storage medium based on reconstruction bias U-Net, so as to solve the existing problems.
In order to achieve the above object, an embodiment of the present invention provides a road extraction method based on reconstruction bias U-Net, including:
Acquiring a training set and a verification set;
constructing a U-Net neural network model, wherein the U-Net neural network model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers;
inputting the training set into a U-Net neural network model for training, and storing parameters of the network model to obtain an initial U-Net neural network model;
inputting the verification set into an initial U-Net neural network model and training by combining a loss function until the model converges to obtain a trained U-Net neural network model;
and inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result.
Further, the inputting the training set into the U-Net neural network model for training specifically includes:
inputting the training set into an encoder comprising five convolution modules, wherein the five convolution modules consist of three modules each comprising a convolution layer, ReLU and a maximum pooling layer, followed by two modules each comprising a convolution layer, ReLU, Dropout and a maximum pooling layer;
inputting the data output by the encoder into a first up-sampling reconstruction layer to output the data after the first up-sampling reconstruction layer;
inputting the data after the first up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to output the data after the first combination layer;
inputting the data after the first combination layer into a second up-sampling reconstruction layer to output the data after the second up-sampling reconstruction layer;
inputting the data after the second up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to output the data after the second combination layer;
inputting the data after the second combination layer into a third up-sampling reconstruction layer to output the data after the third up-sampling reconstruction layer;
inputting the data after the third up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to output the data after the third combination layer;
inputting the data after the third combination layer into a fourth up-sampling reconstruction layer to output the data after the fourth up-sampling reconstruction layer;
and inputting the data after the fourth up-sampling reconstruction layer into a splicing layer and five groups of convolution and ReLU combination layers, and then performing a convolution and Sigmoid operation to obtain an output result for training.
Still further, the first up-sampling reconstruction layer comprises four groups of up-sampling, convolution and ReLU operation combinations, the second up-sampling reconstruction layer comprises five such groups, the third up-sampling reconstruction layer comprises four such groups, and the fourth up-sampling reconstruction layer comprises three such groups.
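The asymmetric layout of the four reconstruction layers (4, 5, 4 and 3 operation groups) is what biases parameters toward the decoder. A tiny sketch of the bookkeeping; the one-group-per-layer "plain" decoder used for comparison is an assumption about a conventional U-Net, not a figure from the patent:

```python
RECON_GROUPS = [4, 5, 4, 3]       # groups per reconstruction layer, per the claim above
PLAIN_UNET_GROUPS = [1, 1, 1, 1]  # assumed conventional U-Net: one up-sample per layer

def extra_decoder_groups():
    # How many additional up-sampling/convolution/ReLU groups the
    # reconstruction bias adds over the assumed symmetric decoder.
    return sum(RECON_GROUPS) - sum(PLAIN_UNET_GROUPS)
```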
Further, the loss function is a binary cross-entropy loss function.
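Binary cross-entropy over a predicted road-probability map can be written as a minimal NumPy sketch; the clipping constant `eps` is an implementation detail added here to avoid log(0), not a value from the patent:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between a 0/1 road mask and predicted
    road probabilities; eps guards against log(0)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1.0 - y_true) * np.log(1.0 - y_pred))))
```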
The invention also provides a road extraction device based on the reconstruction bias U-Net, which comprises the following components:
the acquisition module is used for acquiring a training set and a verification set;
the device comprises a building module, a reconstruction module and a processing module, wherein the building module is used for building a U-Net neural network model, the U-Net neural network model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers;
the first training module is used for inputting the training set into a U-Net neural network model for training and storing parameters of the network model to obtain an initial U-Net neural network model;
the second training module is used for inputting the verification set into an initial U-Net neural network model and training in combination with a loss function until the model converges to obtain a trained U-Net neural network model;
and the extraction module is used for inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result.
Further, the inputting the training set into the U-Net neural network model for training specifically includes:
inputting the training set into an encoder comprising five convolution modules, wherein the five convolution modules consist of three modules each comprising a convolution layer, ReLU and a maximum pooling layer, followed by two modules each comprising a convolution layer, ReLU, Dropout and a maximum pooling layer;
inputting the data output by the encoder into a first up-sampling reconstruction layer to output the data after the first up-sampling reconstruction layer;
inputting the data after the first up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to output the data after the first combination layer;
inputting the data after the first combination layer into a second up-sampling reconstruction layer to output the data after the second up-sampling reconstruction layer;
inputting the data after the second up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to output the data after the second combination layer;
inputting the data after the second combination layer into a third up-sampling reconstruction layer to output the data after the third up-sampling reconstruction layer;
inputting the data after the third up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to output the data after the third combination layer;
inputting the data after the third combination layer into a fourth up-sampling reconstruction layer to output the data after the fourth up-sampling reconstruction layer;
and inputting the data after the fourth up-sampling reconstruction layer into a splicing layer and five groups of convolution and ReLU combination layers, and then performing a convolution and Sigmoid operation to obtain an output result for training.
Still further, the first up-sampling reconstruction layer comprises four groups of up-sampling, convolution and ReLU operation combinations, the second up-sampling reconstruction layer comprises five such groups, the third up-sampling reconstruction layer comprises four such groups, and the fourth up-sampling reconstruction layer comprises three such groups.
Further, the loss function is a binary cross-entropy loss function.
The invention also provides road extraction equipment based on the reconstruction bias U-Net, which comprises a memory and a processor, wherein a computer program is stored in the memory, and the processor is used for operating the computer program to realize the road extraction method based on the reconstruction bias U-Net.
The invention also provides a storage medium, which stores a computer program, and the computer program can be executed by a processor of the device where the storage medium is located, so as to realize the road extraction method based on the reconstruction bias U-Net.
The invention provides a road extraction method based on reconstruction bias U-Net, comprising the following steps: acquiring a training set and a verification set; constructing a U-Net neural network model, wherein the model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers; inputting the training set into the U-Net neural network model for training, and storing the parameters of the network model to obtain an initial U-Net neural network model; inputting the verification set into the initial U-Net neural network model and training it in combination with a loss function until the model converges, to obtain a trained U-Net neural network model; and inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result. The reconstruction-biased U-Net neural network model effectively enhances the reconstruction capability of the network, giving the network better logical reasoning ability for occluded regions and similar ambiguous information, better balancing the encoding and decoding capabilities of the network, and thereby achieving better segmentation accuracy and robustness.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic flow chart of a road extraction method based on a reconstruction bias U-Net according to a first embodiment of the present invention.
Fig. 2 is a schematic diagram of a network structure of the reconstruction bias U-Net according to the embodiment of the present invention.
Fig. 3 is a schematic flow chart of a road extraction device based on the reconstruction bias U-Net according to a second embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings of the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention.
Referring to fig. 1, a first embodiment of the present invention provides a road extraction method based on reconstruction bias U-Net, including:
s11, training set and verification set are obtained.
In this embodiment, the training set includes raw data and labeled data, which are obtained as follows:
acquiring a remote sensing image and a labeled road-only map of the same geographic position;
segmenting and screening the acquired remote sensing image and labeled road-only map, and obtaining the labeled data from the screened road-only maps for use in the training set;
extracting features of the remote sensing image and superposing the features on the original remote sensing image to obtain the raw data, one part of which is used for the training set and the other part for the verification set.
And S12, constructing a U-Net neural network model, wherein the U-Net neural network model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers.
In this embodiment, referring to fig. 2, the encoder is used for feature extraction and the decoder is used for recovering the image size. The encoder comprises five convolution modules: three modules each comprising a convolution layer, ReLU and a maximum pooling layer, followed by two modules each comprising a convolution layer, ReLU, Dropout and a maximum pooling layer. The decoder, i.e. the decoding reconstruction part, is divided into four up-sampling reconstruction layers. The first up-sampling reconstruction layer contains four groups of up-sampling, convolution and ReLU operation combinations. The output of the first up-sampling reconstruction layer is passed through splicing and two groups of convolution and ReLU operations, and the result serves as the input of the second up-sampling reconstruction layer. The second up-sampling reconstruction layer adopts five groups of up-sampling, convolution and ReLU operation combinations; its output is likewise spliced, combined with two groups of convolution and ReLU operations, and serves as the input of the third up-sampling reconstruction layer. The third up-sampling reconstruction layer uses the same operation combination as the first, except for the number of filters and the size of the convolution kernels. The fourth up-sampling reconstruction layer comprises three groups of up-sampling, convolution and ReLU operation combinations; its output is spliced, passed through five groups of convolution and ReLU operation combinations, and then through a final convolution and Sigmoid operation to obtain the final result.
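A single "up-sampling, convolution, ReLU" group, the building block repeated throughout the decoder, can be sketched in NumPy. Nearest-neighbour up-sampling and a stride-1 "same" convolution on a single-channel map are assumptions for illustration; the patent does not specify the up-sampling mode:

```python
import numpy as np

def upsample_nearest(x):
    # Double height and width by repeating each pixel (nearest neighbour).
    return x.repeat(2, axis=0).repeat(2, axis=1)

def conv2d_same(x, kernel):
    # Naive stride-1 "same" 2-D convolution for a single-channel map.
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)
    return out

def up_conv_relu(x, kernel):
    # One decoder operation group: up-sample, convolve, then ReLU.
    return np.maximum(conv2d_same(upsample_nearest(x), kernel), 0.0)
```

Each reconstruction layer chains several such groups; the per-group kernel sizes and filter counts are given in the embodiment below.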
And S13, inputting the training set into a U-Net neural network model for training, and storing parameters of the network model to obtain an initial U-Net neural network model.
In this embodiment, the initial U-Net neural network model is obtained by setting the training batch size, the learning rate and the other hyperparameters, training, and storing the parameters of the network model. For example, the Adam or SGD algorithm may be used for training; it should be noted that other algorithms may also be used, and such schemes all fall within the protection scope of the present invention.
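As a sketch of what one optimizer update looks like, here is a single Adam step in NumPy. The hyperparameter defaults are the customary ones from the Adam literature, not values stated in the patent:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam parameter update: exponential moving averages of the
    gradient and its square, bias correction, then the scaled step."""
    m = b1 * m + (1.0 - b1) * grad
    v = b2 * v + (1.0 - b2) * grad ** 2
    m_hat = m / (1.0 - b1 ** t)        # bias-corrected first moment
    v_hat = v / (1.0 - b2 ** t)        # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

On the first step (t = 1) the bias correction makes the update magnitude approximately `lr`, regardless of the gradient scale.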
And S14, inputting the verification set into an initial U-Net neural network model and training by combining a loss function until the model converges to obtain a trained U-Net neural network model.
In this embodiment, the verification set is input into the initial U-Net neural network model for extraction, the extraction result is compared with the real road map, and iterative computation is performed through the loss function until the model converges. The loss function of this embodiment is binary cross-entropy. Referring to fig. 2 and table 2, the input size of the network model of this embodiment is 256 × 256 × 3. The 256 × 256 × 3 input is passed sequentially through three convolution modules each comprising a convolution layer, ReLU and a maximum pooling layer, wherein the convolution kernels of these three modules are 3 × 3, the kernel of the maximum pooling layer is 2 × 2, and the outputs of the three convolution modules are 128 × 128 × 64, 64 × 64 × 128 and 32 × 32 × 256 in turn.
The 32 × 32 × 256 output is then passed sequentially through two convolution modules each comprising a convolution layer, ReLU, Dropout and a maximum pooling layer, wherein the convolution kernels are 3 × 3, the Dropout coefficient is 0.5, the kernel of the maximum pooling layer is 3 × 3, and the outputs of the two modules are 16 × 16 × 512 and 16 × 16 × 1024 in turn.
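The encoder shapes quoted above can be traced in a few lines. The halving rule and the channel list are read off the description; that the fifth module keeps the 16 × 16 spatial size is inferred from the stated output shapes:

```python
def encoder_shapes(size=256):
    """Trace feature-map sizes through the five encoder modules: each of
    the first four pooling steps halves the spatial size, and channels
    grow 64 -> 128 -> 256 -> 512 -> 1024 (per the description)."""
    channels = [64, 128, 256, 512, 1024]
    shapes, h = [], size
    for i, c in enumerate(channels):
        if i < 4:          # the fifth module keeps 16 x 16 per the text
            h //= 2
        shapes.append((h, h, c))
    return shapes
```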
The 16 × 16 × 1024 data is input into the first up-sampling reconstruction layer, which comprises four groups of up-sampling, convolution and ReLU operation combinations. (The convolution kernels of the four groups are shown as an image in the original publication and are not reproduced in the text.)
The output results are 32 × 32 × 512, 32 × 32 × 128, 32 × 32 × 64 and 32 × 32 × 32 respectively. After splicing, two groups of convolution and ReLU operations output 32 × 32 × 512, wherein both groups of convolution are 3 × 3/256.
The 32 × 32 × 512 output is input into the second up-sampling reconstruction layer, which comprises five groups of up-sampling, convolution and ReLU operation combinations. (The convolution kernels of the five groups are shown as an image in the original publication and are not reproduced in the text.)
The output results are 64 × 64 × 256, 64 × 64 × 64, 64 × 64 × 32, 64 × 64 × 16 and 64 × 64 × 8 respectively. After splicing, two groups of convolution and ReLU operations output 64 × 64 × 256, wherein both groups of convolution are 3 × 3/256.
The 64 × 64 × 256 output is input into the third up-sampling reconstruction layer, which comprises four groups of up-sampling, convolution and ReLU operation combinations. (The convolution kernels of the four groups are shown as an image in the original publication and are not reproduced in the text.)
The output results are 128 × 128 × 128, 128 × 128 × 32, 128 × 128 × 16 and 128 × 128 × 2 respectively. After splicing, two groups of convolution and ReLU operations output 128 × 128 × 128, wherein both groups of convolution are 3 × 3/128.
The 128 × 128 × 128 output is input into the fourth up-sampling reconstruction layer, which comprises three groups of up-sampling, convolution and ReLU operation combinations. (The convolution kernels of the three groups are shown as an image in the original publication and are not reproduced in the text.)
The output results are 256 × 256 × 64, 256 × 256 × 16 and 256 × 256 × 8 respectively. After splicing, five groups of convolution and ReLU operations and then a final convolution and Sigmoid operation are performed, giving a final output of 256 × 256 × 1, wherein the five groups of convolution are 3 × 3/64, 3 × 3/64, 3 × 3/3, 1 × 1/3 and 1 × 1/3 in turn, and the final convolution is 1 × 1/1. It should be noted that the loss function may also be another function, and such schemes all fall within the protection scope of the present invention.
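The decoder side can be traced the same way. The merged channel counts after each reconstruction layer (512, 256, 128, then the 1-channel Sigmoid mask) are taken from the shapes stated in this embodiment:

```python
def decoder_shapes(h=16):
    """Trace the merged output size after each up-sampling reconstruction
    layer: each layer doubles height and width, and the final 1x1
    convolution + Sigmoid produces a single-channel road mask."""
    merged_channels = [512, 256, 128]
    shapes = []
    for c in merged_channels:
        h *= 2                      # each reconstruction layer doubles h and w
        shapes.append((h, h, c))
    h *= 2                          # fourth layer, then 1x1 conv + Sigmoid
    shapes.append((h, h, 1))
    return shapes
```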
And S15, inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result.
The road extraction method based on reconstruction bias U-Net provided by this embodiment comprises the following steps: acquiring a training set and a verification set; constructing a U-Net neural network model, wherein the model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers; inputting the training set into the U-Net neural network model for training, and storing the parameters of the network model to obtain an initial U-Net neural network model; inputting the verification set into the initial U-Net neural network model and training it in combination with a loss function until the model converges, to obtain a trained U-Net neural network model; and inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result. The reconstruction-biased U-Net neural network model effectively enhances the reconstruction capability of the network, giving the network better logical reasoning ability for occluded regions and similar ambiguous information, better balancing the encoding and decoding capabilities of the network, and thereby achieving better segmentation accuracy and robustness.
TABLE 2 specific parameters and operations used by the various layers in the network model
(Table 2 appears as an image in the original publication; its contents are not reproduced in the text.)
A second embodiment of the present invention provides a road extraction device based on reconstruction bias U-Net, referring to FIG. 3, including:
An obtaining module 110 is configured to obtain a training set and a verification set.
In this embodiment, the training set obtained by the obtaining module 110 includes raw data and labeled data, which are obtained as follows:
acquiring a remote sensing image and a labeled road-only map of the same geographic position;
segmenting and screening the acquired remote sensing image and labeled road-only map, and obtaining the labeled data from the screened road-only maps for use in the training set;
extracting features of the remote sensing image and superposing the features on the original remote sensing image to obtain the raw data, one part of which is used for the training set and the other part for the verification set.
And a building module 120, configured to build a U-Net neural network model, where the U-Net neural network model includes an encoder and a decoder, the encoder includes five convolution modules, and the decoder includes four upsampled reconstruction layers.
In this embodiment, referring to fig. 2, the encoder is used for feature extraction and the decoder is used for recovering the image size. The encoder comprises five convolution modules: three modules each comprising a convolution layer, ReLU and a maximum pooling layer, followed by two modules each comprising a convolution layer, ReLU, Dropout and a maximum pooling layer. The decoder, i.e. the decoding reconstruction part, is divided into four up-sampling reconstruction layers. The first up-sampling reconstruction layer contains four groups of up-sampling, convolution and ReLU operation combinations. The output of the first up-sampling reconstruction layer is passed through splicing and two groups of convolution and ReLU operations, and the result serves as the input of the second up-sampling reconstruction layer. The second up-sampling reconstruction layer adopts five groups of up-sampling, convolution and ReLU operation combinations; its output is likewise spliced, combined with two groups of convolution and ReLU operations, and serves as the input of the third up-sampling reconstruction layer. The third up-sampling reconstruction layer uses the same operation combination as the first, except for the number of filters and the size of the convolution kernels. The fourth up-sampling reconstruction layer comprises three groups of up-sampling, convolution and ReLU operation combinations; its output is spliced, passed through five groups of convolution and ReLU operation combinations, and then through a final convolution and Sigmoid operation to obtain the final result.
And the first training module 130 is configured to input the training set into a U-Net neural network model for training, and store parameters of the network model to obtain an initial U-Net neural network model.
In this embodiment, the initial U-Net neural network model is obtained by setting the training batch size, the learning rate and the other hyperparameters, training, and storing the parameters of the network model. For example, the Adam or SGD algorithm may be used for training; it should be noted that other algorithms may also be used, and such schemes all fall within the protection scope of the present invention.
And the second training module 140 is configured to input the verification set into the initial U-Net neural network model and train in combination with the loss function until the model converges to obtain a trained U-Net neural network model.
And the extraction module 150 is used for inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result.
This embodiment provides a road extraction device based on reconstruction bias U-Net, comprising: an obtaining module 110, configured to obtain a training set and a verification set; a building module 120, configured to build a U-Net neural network model, wherein the model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers; a first training module 130, configured to input the training set into the U-Net neural network model for training and store the parameters of the network model to obtain an initial U-Net neural network model; a second training module 140, configured to input the verification set into the initial U-Net neural network model and train it in combination with a loss function until the model converges, to obtain a trained U-Net neural network model; and an extraction module 150, configured to input the image to be detected into the trained U-Net neural network model to obtain a road extraction result. The reconstruction-biased U-Net neural network model effectively enhances the reconstruction capability of the network, giving the network better logical reasoning ability for occluded regions and similar ambiguous information, better balancing the encoding and decoding capabilities of the network, and thereby achieving better segmentation accuracy and robustness.
The invention further provides a road extraction device based on reconstruction bias U-Net, which includes a memory and a processor, where the memory stores a computer program, and the processor is configured to run the computer program to implement the road extraction method based on reconstruction bias U-Net.
A fourth embodiment of the present invention provides a storage medium, where a computer program is stored, where the computer program can be executed by a processor of a device in which the storage medium is located, so as to implement the road extraction method based on the reconstruction bias U-Net.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus and method embodiments described above are illustrative only. The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatuses, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks therein, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A road extraction method based on reconstruction bias U-Net is characterized by comprising the following steps:
acquiring a training set and a verification set;
constructing a U-Net neural network model, wherein the U-Net neural network model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers;
inputting the training set into a U-Net neural network model for training, and storing parameters of the network model to obtain an initial U-Net neural network model;
inputting the verification set into an initial U-Net neural network model and training by combining a loss function until the model converges to obtain a trained U-Net neural network model;
and inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result.
2. The road extraction method based on the reconstruction bias U-Net according to claim 1, wherein the training by inputting the training set into the U-Net neural network model specifically comprises:
inputting the training set into an encoder comprising five convolution modules in sequence, wherein the five convolution modules consist of three modules each comprising a convolution layer, a ReLU and a maximum pooling layer, followed by two modules each comprising a convolution layer, a ReLU, a Dropout layer and a maximum pooling layer;
inputting the data output by the encoder into a first up-sampling reconstruction layer to obtain the output of the first up-sampling reconstruction layer;
inputting the output of the first up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to obtain the output of the first combination layer;
inputting the output of the first combination layer into a second up-sampling reconstruction layer to obtain the output of the second up-sampling reconstruction layer;
inputting the output of the second up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to obtain the output of the second combination layer;
inputting the output of the second combination layer into a third up-sampling reconstruction layer to obtain the output of the third up-sampling reconstruction layer;
inputting the output of the third up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to obtain the output of the third combination layer;
inputting the output of the third combination layer into a fourth up-sampling reconstruction layer to obtain the output of the fourth up-sampling reconstruction layer;
and inputting the output of the fourth up-sampling reconstruction layer into a splicing layer and five groups of convolution and ReLU combination layers, and then performing a convolution and a Sigmoid operation to obtain an output result for training.
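The decoder above ends with a convolution followed by a Sigmoid that maps each pixel to a road probability. A minimal numpy sketch of that output head, assuming a 1×1 output convolution (the kernel size and the weights below are illustrative, not fixed by the patent):

```python
import numpy as np

# Hedged sketch of the output head: a 1x1 convolution collapsing C feature
# channels to one logit per pixel, then a pixel-wise Sigmoid.
def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1_sigmoid(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """features: (C, H, W); weights: (C,). Returns (H, W) road probabilities."""
    logits = np.tensordot(weights, features, axes=(0, 0))  # 1x1 conv = channel mix
    return sigmoid(logits)

feats = np.zeros((3, 2, 2))          # illustrative 3-channel 2x2 feature map
w = np.array([0.2, -0.1, 0.5])       # illustrative 1x1-conv weights
probs = conv1x1_sigmoid(feats, w)    # zero logits -> probability 0.5 everywhere
```

Thresholding `probs` (e.g. at 0.5) then yields the binary road mask used as the extraction result.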
3. The method of claim 2, wherein the first up-sampling reconstruction layer comprises four groups of up-sampling, convolution and ReLU operation combinations, the second up-sampling reconstruction layer comprises five groups of up-sampling, convolution and ReLU operation combinations, the third up-sampling reconstruction layer comprises four groups of up-sampling, convolution and ReLU operation combinations, and the fourth up-sampling reconstruction layer comprises three groups of up-sampling, convolution and ReLU operation combinations.
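Claim 3 fixes only the number of (up-sampling, convolution, ReLU) groups per reconstruction layer; the up-sampling mode and kernel sizes are left open. A minimal numpy sketch under the assumptions of nearest-neighbour ×2 up-sampling and a 1×1 convolution (both illustrative choices, not specified by the patent):

```python
import numpy as np

def upsample2x(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbour doubling of a (C, H, W) tensor's spatial dims."""
    return np.repeat(np.repeat(x, 2, axis=1), 2, axis=2)

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

def reconstruction_layer(x: np.ndarray, mix: np.ndarray, n_groups: int) -> np.ndarray:
    """n_groups repetitions of (up-sampling, 1x1 conv given by `mix`, ReLU)."""
    for _ in range(n_groups):
        x = upsample2x(x)
        x = np.tensordot(mix, x, axes=(1, 0))  # 1x1 conv = per-pixel channel mix
        x = relu(x)
    return x

x = np.ones((4, 2, 2))    # illustrative 4-channel 2x2 input
mix = np.eye(4) * 0.5     # illustrative 1x1-conv weights
y = reconstruction_layer(x, mix, n_groups=4)  # first layer: four groups
# spatial dims double per group: 2 -> 4 -> 8 -> 16 -> 32
```

The four decoder layers would be instantiated with `n_groups` of 4, 5, 4 and 3 respectively, per the claim.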
4. The method of claim 1, wherein the loss function is a binary cross-entropy loss function.
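The binary cross-entropy loss named in claim 4 can be sketched as follows; the averaging over pixels and the epsilon clipping are common numerical conventions assumed here, not details fixed by the patent:

```python
import numpy as np

# Hedged sketch of binary cross-entropy for a road/non-road mask.
def binary_cross_entropy(y_true: np.ndarray, y_pred: np.ndarray,
                         eps: float = 1e-7) -> float:
    """BCE = -mean(y*log(p) + (1-y)*log(1-p)) over all pixels."""
    p = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

y = np.array([1.0, 0.0, 1.0, 0.0])   # ground-truth road mask (flattened)
p = np.array([0.5, 0.5, 0.5, 0.5])   # uninformative predictions
loss = binary_cross_entropy(y, p)    # -log(0.5), about 0.6931
```

During training this loss is computed between the Sigmoid output of the network and the ground-truth road mask, and minimized until the model converges.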
5. A road extraction device based on reconstruction bias U-Net is characterized by comprising:
the acquisition module is used for acquiring a training set and a verification set;
the device comprises a building module, a reconstruction module and a processing module, wherein the building module is used for building a U-Net neural network model, the U-Net neural network model comprises an encoder and a decoder, the encoder comprises five convolution modules, and the decoder comprises four up-sampling reconstruction layers;
the first training module is used for inputting the training set into a U-Net neural network model for training and storing parameters of the network model to obtain an initial U-Net neural network model;
the second training module is used for inputting the verification set into an initial U-Net neural network model and training in combination with a loss function until the model converges to obtain a trained U-Net neural network model;
and the extraction module is used for inputting the image to be detected into the trained U-Net neural network model to obtain a road extraction result.
6. The road extraction device based on the reconstruction bias U-Net according to claim 5, wherein the first training module is configured to input the training set into a U-Net neural network model for training, and specifically:
inputting the training set into an encoder comprising five convolution modules in sequence, wherein the five convolution modules consist of three modules each comprising a convolution layer, a ReLU and a maximum pooling layer, followed by two modules each comprising a convolution layer, a ReLU, a Dropout layer and a maximum pooling layer;
inputting the data output by the encoder into a first up-sampling reconstruction layer to obtain the output of the first up-sampling reconstruction layer;
inputting the output of the first up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to obtain the output of the first combination layer;
inputting the output of the first combination layer into a second up-sampling reconstruction layer to obtain the output of the second up-sampling reconstruction layer;
inputting the output of the second up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to obtain the output of the second combination layer;
inputting the output of the second combination layer into a third up-sampling reconstruction layer to obtain the output of the third up-sampling reconstruction layer;
inputting the output of the third up-sampling reconstruction layer into a splicing layer and two groups of convolution and ReLU combination layers to obtain the output of the third combination layer;
inputting the output of the third combination layer into a fourth up-sampling reconstruction layer to obtain the output of the fourth up-sampling reconstruction layer;
and inputting the output of the fourth up-sampling reconstruction layer into a splicing layer and five groups of convolution and ReLU combination layers, and then performing a convolution and a Sigmoid operation to obtain an output result for training.
7. The reconstruction bias U-Net based road extraction device of claim 6, wherein the first up-sampling reconstruction layer comprises four groups of up-sampling, convolution and ReLU operation combinations, the second up-sampling reconstruction layer comprises five groups of up-sampling, convolution and ReLU operation combinations, the third up-sampling reconstruction layer comprises four groups of up-sampling, convolution and ReLU operation combinations, and the fourth up-sampling reconstruction layer comprises three groups of up-sampling, convolution and ReLU operation combinations.
8. The road extraction device based on reconstruction bias U-Net according to claim 5, wherein the loss function is a binary cross-entropy loss function.
9. A road extraction device based on reconstruction bias U-Net, comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to run the computer program to implement a road extraction method based on reconstruction bias U-Net according to any one of claims 1 to 4.
10. A storage medium, characterized in that the storage medium stores a computer program, which can be executed by a processor of a device in which the storage medium is located, to implement a road extraction method based on reconstruction bias U-Net according to any one of claims 1 to 4.
CN202110038614.9A 2021-01-12 2021-01-12 Road extraction method, device, equipment and storage medium based on reconstruction bias U-Net Active CN112699835B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110038614.9A CN112699835B (en) 2021-01-12 2021-01-12 Road extraction method, device, equipment and storage medium based on reconstruction bias U-Net

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110038614.9A CN112699835B (en) 2021-01-12 2021-01-12 Road extraction method, device, equipment and storage medium based on reconstruction bias U-Net

Publications (2)

Publication Number Publication Date
CN112699835A true CN112699835A (en) 2021-04-23
CN112699835B CN112699835B (en) 2023-09-26

Family

ID=75514244

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110038614.9A Active CN112699835B (en) 2021-01-12 2021-01-12 Road extraction method, device, equipment and storage medium based on reconstruction bias U-Net

Country Status (1)

Country Link
CN (1) CN112699835B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109509178A (en) * 2018-10-24 2019-03-22 苏州大学 OCT image choroid segmentation method based on an improved U-net network
CN111091555A (en) * 2019-12-12 2020-05-01 哈尔滨市科佳通用机电股份有限公司 Brake shoe breaking target detection method
CN111145170A (en) * 2019-12-31 2020-05-12 电子科技大学 Medical image segmentation method based on deep learning
CN112183258A (en) * 2020-09-16 2021-01-05 太原理工大学 Remote sensing image road segmentation method based on context information and attention mechanism

Also Published As

Publication number Publication date
CN112699835B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
CN108062754B (en) Segmentation and identification method and device based on dense network image
CN111681252B (en) Medical image automatic segmentation method based on multipath attention fusion
CN111079532A (en) Video content description method based on text self-encoder
KR102225024B1 (en) Apparatus and method for image inpainting
CN109766918B (en) Salient object detection method based on multilevel context information fusion
CN115018954A (en) Image generation method and device and electronic equipment
CN112381733B (en) Image recovery-oriented multi-scale neural network structure searching method and network application
CN112861795A (en) Method and device for detecting salient target of remote sensing image based on multi-scale feature fusion
CN114742985A (en) Hyperspectral feature extraction method and device and storage medium
CN112132158A (en) Visual picture information embedding method based on self-coding network
CN110599495B (en) Image segmentation method based on semantic information mining
CN112733777B (en) Road extraction method, device and equipment of remote sensing image and storage medium
CN113298892A (en) Image coding method and device, and storage medium
CN112699835A (en) Road extraction method, device and equipment based on reconstruction bias U-Net and storage medium
CN114529794B (en) Infrared and visible light image fusion method, system and medium
CN116468947A (en) Cutter image recognition method, cutter image recognition device, computer equipment and storage medium
CN115272131A (en) Image Moire pattern removing system and method based on self-adaptive multi-spectral coding
CN113724307B (en) Image registration method and device based on characteristic self-calibration network and related components
CN114494387A (en) Data set network generation model and fog map generation method
CN111524090A (en) Depth prediction image-based RGB-D significance detection method
CN117671432B (en) Method and device for training change analysis model, electronic equipment and storage medium
CN117314938B (en) Image segmentation method and device based on multi-scale feature fusion decoding
CN113034472B (en) Airspace stegance analysis method and system based on gradient network architecture search
CN111723868B (en) Method, device and server for removing homologous pictures
CN116912345B (en) Portrait cartoon processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant